Chuck Ward edited Data Repositories.tex  about 10 years ago

Commit id: c4a1f7c8bfa7cf361de278979407c140585e1430


\subsection{Data Repositories}  Aside from crystallographic data repositories, there are at present perhaps no dedicated materials data repositories that meet the required characteristics defined above. The materials science and engineering community does have numerous publicly accessible data repositories; however, the majority of these are associated with specific projects or research groups, and their persistence is therefore dependent on individual funding decisions. These repositories are primarily established to house and share the research data generated within a specific project or program. They generally do not follow uniform standards for data and metadata, nor do they provide for data discoverability and citation. Very few repositories have been established with the explicit objective of giving MSE a public home for accessible digital data. In short, publicly accessible, built-for-purpose repositories and the associated infrastructure for access, safe storage, and management still need to be developed; this is the largest impediment to implementing viable data archiving policies. Evolutionary biology, for example, allows a mix of repositories that meet established criteria. Such criteria may be as simple as requiring that cited data be permanently archived in data repositories that meet the following conditions:  \begin{enumerate} 

  In all likelihood, like biology, MSE publications will depend on a collection of repositories that are tailored to specific kinds of materials data. For example, NIST is building and demonstrating a data file repository for CALPHAD and interatomic potentials.\cite{NISTMDR} Such repositories may be expandable and largely sufficient for thematic journals such as those devoted to thermodynamics and diffusion. However, repositories of this kind will fill only a relatively small niche in MSE.

  Finally, a business model for sustainably archiving materials data is required. Other technical fields, such as earth sciences, can at least partially rely on government-provided repositories for large and complex datasets. Without repositories of this type to build on, MSE will need to establish viable repository solutions of its own. In response to funding agency requirements for data management plans, some universities, Johns Hopkins for example, are beginning to provide centrally hosted data repositories, but these are not yet common.\cite{jhudata} Private fee-for-service repositories, such as LabArchives and figshare, are also evolving to meet growing demand for accessible data storage.\cite{labarchives,figshare} Additionally, ASM International is working to create a prototype materials data repository through its close association with Granta Design. Termed the Computational Materials Data Network (CMDN), this is an interesting option because it will provide a structured database specifically for materials data, but the business model for CMDN has not yet been solidified.\cite{cmdn} A key open question is how funding agencies will respond to the OSTP open research policy memo, and how they will fund activities that make data open to the public.