Sustainable Data Curation and Dissemination for Thermodynamics

By Kenneth Kroenlein

Licensed according to this deed.

Abstract

Exponential growth in publication rates and data generation for thermophysical properties has created tremendous challenges, as well as potential rewards, for data analysis groups. Data volumes have grown to such a degree that many traditional data collection and interpretation approaches cannot scale sufficiently to remain comprehensive and current, or to effectively track shifting interests within research and industrial communities. It is thus necessary to rely much more heavily on digital archives, automated analysis, and machine learning approaches.

The approach adopted at the Thermodynamics Research Center (TRC) at the National Institute of Standards and Technology (NIST) is dynamic data evaluation, whereby a reliable and comprehensive underlying data archive is used in conjunction with algorithmically encoded expert analysis to generate up-to-date data recommendations. These efforts have facilitated a decade-long collaboration with five major journals that report thermophysical and thermochemical property information, in which reported data are vetted for consistency by TRC before being made freely and openly available. These data are disseminated via ThermoML [1], an XML-based file format and IUPAC standard that was developed in close collaboration among representatives from TRC, industry, and academia, including journal editors. Schema development was largely informed by real data sets culled from the open literature, thus ensuring compatibility with a broad range of target information. Impacts from this collaboration and lessons learned from the development effort will be discussed.

[1] Frenkel, M.; Chirico, R.D.; Diky, V.V.; Marsh, K.N.; Dymond, J.H.; Wakeham, W.A.; Stein, S.E.; Königsberger, E.; Goodwin, A.R.H. “XML-based IUPAC standard for experimental, predicted, and critically evaluated thermodynamic property data storage and capture (ThermoML): IUPAC recommendations 2006.” Pure Appl. Chem. 2006, 78, 541–612.

 

Cite this work

Researchers should cite this work as follows:

  • Kenneth Kroenlein (2014), "Sustainable Data Curation and Dissemination for Thermodynamics," https://ncihub.org/resources/700.

Submitter

Mervi Heiskanen

National Cancer Institute
