DPG Verhandlungen

Dresden 2026 – Scientific Programme

FM: Functional Materials Division (Fachverband Funktionsmaterialien)

FM 6: Focus Session: Materials Discovery I – Material informatics

FM 6.5: Talk

Tuesday, 10 March 2026, 10:45–11:00, BEY/0138

Finding interoperable datasets in diverse databases via provenance and similarity analysis — •Martin Kuban, Alvin Noe Ladines, Thea Denell, Lauri Himanen, Joseph F. Rudzinski, Claudia Draxl, and FAIRmat team — Physics Department and CSMB, Humboldt-Universität zu Berlin, Germany

Collecting data from different sources can significantly increase the amount of data available for data-driven discovery. However, different data producers use distinct methods and setups, e.g., the approximations and parameters used to produce computational data, to achieve the best data quality for the properties studied in a specific project. Bringing these data together requires understanding the impact of the method and setup on the accuracy and precision of the produced data. To do so, two key requirements must be fulfilled. First, the provenance and metadata of each data point need to be recorded. This can be achieved by leveraging the NOMAD infrastructure [1, 2], an ecosystem of parsers, schemas, and workflow tools, to extract rich metadata and provenance information. Second, using this information, similarity metrics can be used to identify data that achieve similar precision despite distinct computational setups [3]. We showcase our approach on different examples using data from NOMAD.
[1] Draxl and Scheffler, MRS Bulletin 43 (2018), 676-682.
[2] Scheidgen et al., Journal of Open Source Software 8 (2023), 5388.
[3] Kuban et al., MRS Bulletin 47 (2022), 991-999.
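
The second requirement above can be illustrated with a minimal sketch. The abstract does not state which similarity metric is used, so the Tanimoto coefficient below is an assumed choice, and the `Entry` record, its fields, the `interoperable_pairs` helper, and the 0.9 threshold are all hypothetical names and values introduced for illustration; none of this is the NOMAD API or the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical record: provenance metadata plus a numeric fingerprint."""
    entry_id: str
    setup: dict          # assumed metadata, e.g. {"code": "A", "basis": "large"}
    fingerprint: tuple   # assumed non-negative feature vector of a property


def tanimoto(a, b):
    """Tanimoto coefficient of two equal-length real vectors (1.0 = identical)."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    # Two all-zero vectors are treated as identical by convention here.
    return 1.0 if denom == 0 else dot / denom


def interoperable_pairs(entries, threshold=0.9):
    """Return (id, id, score) for entries whose setups differ but whose
    fingerprints are at least `threshold` similar (threshold is illustrative)."""
    pairs = []
    for i, first in enumerate(entries):
        for second in entries[i + 1:]:
            if first.setup == second.setup:
                continue  # same setup: nothing to learn about interoperability
            score = tanimoto(first.fingerprint, second.fingerprint)
            if score >= threshold:
                pairs.append((first.entry_id, second.entry_id, score))
    return pairs


entries = [
    Entry("a", {"code": "A"}, (1.0, 1.0, 0.0)),
    Entry("b", {"code": "B"}, (1.0, 1.0, 0.0)),
    Entry("c", {"code": "B"}, (0.0, 0.0, 1.0)),
]
pairs = interoperable_pairs(entries)
```

In this toy input, only entries `a` and `b` are reported: they were produced with different setups yet have matching fingerprints, which is the situation the abstract describes as data that can be used together.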

Keywords: data management; data-driven discovery; similarity
