The second pathway requires the user to log in. By using technologies such as ORCiD \url{http://orcid.org/}, Twitter \url{https://twitter.com/} and OAuth \url{http://oauth.net/}, a user will be able to log in to COPO using existing credentials, removing the need to create a new account and offloading some of the burden of security onto trusted, industry-standard methods. Additionally, services like ORCiD will enable COPO to federate a researcher's existing information into their COPO Profile, such as professional contact details, previous publications and collaborators. Once logged in, all the query capability described above will be available, as well as the ability to create COPO Profiles. A Profile can be thought of as the digital location of a complete body of research, fully attributed to one or more researchers. Responsive web forms allow the Profile to be properly labelled with metadata relating to creators, contributors, institutions, subject, sample, methodology, etc.

Collections of objects can then be added to a Profile. Such Collections are delimited by file type or function. For instance, one Collection may contain all the sequence data associated with a study, another may contain a number of published manuscripts, and yet another might contain references to a number of source code repositories or analysis workflows. When creating these Collections, essential metadata can be attributed which properly describe the objects therein. By taking this metadata and integrating it with existing ontologies, COPO not only indexes the research objects passing through it, but also semantically enriches them. They are no longer simply collections of unstructured, unrelated data, but entities described in terms of their similarity to other existing objects. It then becomes possible to make inferences about the kinds of things a researcher might be interested in, based on the samples, studies, manuscripts, source code, file types, abstracts, methodologies, references, and institutions which reside within their research objects.

Since COPO is a brokering service, the raw data within a Profile is not physically stored on its servers for extended periods, as it would be in an archival service. Rather, once a collection of research objects has been uploaded to and labelled within a COPO Profile, the objects are seamlessly deposited into the relevant public repositories. Such repositories return unique accessions which are then used to identify the deposited data files. These accessions are stored within the COPO Profile alongside the user-supplied metadata. If the data files are subsequently needed as part of a user query, they can be easily and quickly downloaded again from the repository to COPO's infrastructure, ready to be used in any subsequent analysis.

COPO's efficient and intuitive web interfaces thus allow for the input of important metadata, and by minting DOIs (Digital Object Identifiers) that identify COPO Profiles, these metadata can be published as persistent, first-class entities on the Internet. A resolution service, such as dx.doi.org, will direct users back to the COPO Profile identified by the DOI. Therefore, whether the DOI appears alongside a dataset, a paper or a code repository, the user can see all related research objects in a single view without having to search repositories and websites individually.
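To make the Profile, Collection and brokering concepts above concrete, the following is a minimal, purely illustrative sketch in Python. The class and function names (\texttt{Profile}, \texttt{Collection}, \texttt{deposit\_collection}) and the repository client are hypothetical, and do not reflect COPO's actual implementation or any repository's real submission API.

\begin{verbatim}
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Collection:
    """Related research objects, delimited by file type or function."""
    name: str                    # e.g. "sequence data", "manuscripts"
    files: List[str]             # paths/URLs of the raw objects
    metadata: Dict[str, str]     # user-supplied descriptive metadata
    accessions: List[str] = field(default_factory=list)

@dataclass
class Profile:
    """The digital location of a complete body of research."""
    creators: List[str]
    metadata: Dict[str, str]     # creators, institutions, subject, ...
    collections: List[Collection] = field(default_factory=list)
    doi: str = ""                # minted when the Profile is published

def deposit_collection(collection, repository):
    """Broker a Collection into a public repository.

    The raw files are not retained by the broker; only the accessions
    returned by the (hypothetical) repository client are stored alongside
    the metadata, so the files can be re-fetched if a later query needs them.
    """
    for path in collection.files:
        accession = repository.submit(path, collection.metadata)
        collection.accessions.append(accession)
\end{verbatim}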
Since DOIs provide persistent identification, the system will alleviate the problem of ``link decay'', which occurs when resources referenced by a URL become permanently unavailable. The mapping of DOIs to research objects additionally allows usage of these objects to be directly tracked and referenced, enabling researchers to be properly recognised and credited for all the outputs they produce, not just via the typical route of publishing a paper, which is not a comprehensive, truly digitally accessible representation of a whole body of work. Since depositing objects through COPO naturally builds a large interconnected graph of metadata, ontological inferences can be made and used to suggest, for example, further literature searches, experimental procedures or comparable datasets.
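As a closing illustration of how such a metadata graph could drive suggestions, the sketch below ranks other Profiles by the annotation terms they share with a given Profile. It is a toy example under assumed inputs, not COPO's inference machinery; the term labels and Profile identifiers are invented for illustration.

\begin{verbatim}
from typing import Dict, List, Set

def suggest_related(my_terms: Set[str],
                    catalogue: Dict[str, Set[str]],
                    top_n: int = 5) -> List[str]:
    """Rank other Profiles by the number of shared annotation terms."""
    scored = [(len(my_terms & terms), profile_id)
              for profile_id, terms in catalogue.items()
              if my_terms & terms]
    scored.sort(reverse=True)
    return [profile_id for _, profile_id in scored[:top_n]]

# Toy data: term sets standing in for ontology annotations.
mine = {"term:drought_stress", "term:rna_seq", "term:wheat"}
others = {"profile-A": {"term:rna_seq", "term:wheat"},
          "profile-B": {"term:drought_stress"},
          "profile-C": {"term:soil_ph"}}
print(suggest_related(mine, others))   # -> ['profile-A', 'profile-B']
\end{verbatim}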