Standards Enabling Data Discoverability, Exchange and Reuse

As noted in the previous section, standards for data and metadata provide the basis for a structured data archive, enabling the rapid discovery of data and assisting in determining the data’s relevance and usefulness. At the most basic level, good data practice generally requires the generation, and acceptance, of a vocabulary defining the terms used to describe reported data. This assures data users that they precisely understand the context of the data they are reviewing. Beyond this level, other attributes, features, or requirements can be imposed on a data management system, including ontologies, schemas, and formats.

Other fields have studied these issues as a community, and MSE is now starting to develop a concerted effort to define its approach to setting data standards. The European Union is studying the creation of standards for the exchange of engineering materials data through the European Committee for Standardization.\cite{Austin_2013} The target for these standards is structural materials, with an early emphasis on aerospace applications. The European Commission is also funding a broader activity called the Integrated Computational Materials Engineering expert group (ICMEg), with the aim of developing the standards and protocols needed to support the digital exchange of materials data needed to conduct ICME.\cite{Schmitz_2014} ASTM International issued data standards relevant to materials in the early to mid-1990s, but those standards have since been abandoned, likely because they were ahead of true need. However, ASTM International has been reviving its efforts to provide guidance on the digitization of materials test data by exploring the re-establishment of its Computerization and Networking of Materials Databases Symposium Series.\cite{Rumble_2014} These two efforts address a relatively narrow, but industrially important, segment of materials data. Several recent papers are starting to propose standards for other types of materials data, including thermodynamic and image-based data.\cite{Jackson_2014}\cite{Campbell_2014} There are also closed-loop approaches to materials data standardization that exist within commercial data management software packages (Granta Design is one example), but these are not generally available to the public.

While the field of information technology is continuously evolving to provide solutions for using unstructured data more productively, at present there is no accepted community-wide practice for MSE data and metadata standards. Near-term solutions for governing the archiving of materials data will need to be relatively loose, flexible, and evolutionary, with a drive toward more standardization. While publishers may not be able to directly provide data repository services, they are reasonably well positioned and willing to aid the community in establishing data standards. Concerning the pursuit of standardization across a technical field, Michael Whitlock, a primary champion of journal data archiving in the field of evolutionary biology, offered this quote from Voltaire based on his experience: “the perfect is the enemy of the good”.\cite{Whitlock-pc} It is perhaps much more important at this stage of our digital maturity that MSE first implement data archiving with the best guidance available, and then work to build in standardization over time.