The present work introduces HOMER (High Performance Measurement and Computing tool for Ontology-based Metadata Extraction and Re-use), a metadata crawler written in Python that automatically retrieves relevant research metadata from script-based workflows on HPC systems. The tool offers a flexible approach to metadata collection, as the metadata schema can be read from an ontology file. With minimal user input, the crawler can be adapted to the user's needs and integrated into the workflow to retrieve the relevant metadata. The retrieved information can then be post-processed automatically. For example, strings may be trimmed using regular expressions, and numerical values may be averaged. Currently, data can be collected from text files and HDF5 files, or hardcoded directly by the user. The tool has a modular design, however, which allows straightforward extension of the supported file types, the instruction-processing routines, and the post-processing operations.
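The two post-processing operations mentioned above (trimming strings with a regular expression and averaging numerical values) can be illustrated with a minimal sketch. It does not reproduce HOMER's actual interface: the function names `trim_with_regex` and `average`, the keys `solver_version` and `time_step`, and the sample values are all hypothetical and chosen only for illustration.

```python
import re
from statistics import mean


def trim_with_regex(raw: str, pattern: str) -> str:
    """Return the first capture group of `pattern` found in `raw`.

    Falls back to the whitespace-stripped input if nothing matches.
    `pattern` is expected to contain one capture group.
    """
    match = re.search(pattern, raw)
    return match.group(1) if match else raw.strip()


def average(values) -> float:
    """Average a sequence of numbers that may be given as strings."""
    return mean(float(v) for v in values)


# Hypothetical raw values, as they might look after being read from a text file.
raw_metadata = {
    "solver_version": "Solver version: v2.3.1 (build 77)",
    "time_step": ["0.010", "0.012", "0.011"],
}

# Apply one post-processing operation per metadata entry.
processed = {
    "solver_version": trim_with_regex(
        raw_metadata["solver_version"], r"(v\d+\.\d+\.\d+)"
    ),
    "time_step": average(raw_metadata["time_step"]),
}

print(processed)
```

Keeping each operation as a small standalone function mirrors the modular design described in the abstract: further operations could be added in the same way without touching the existing ones.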

Keywords

Metadata extraction

HPMC

Ontology

Research Data Managem...

Language

English

Department / field

16 Department of Mechanical Engineering > Institute for Fluid Systems (FST) > Research data management and digital literacy

DDC

600 Technology, medicine, applied sciences > 620 Engineering and mechanical engineering

Institution

Universitäts- und Landesbibliothek Darmstadt

Place

Darmstadt

Journal / series title

ing.grid : FAIR data management in engineering sciences

Journal volume

Journal issue number

ISSN

2941-1300

Institution of first publication

Universitäts- und Landesbibliothek Darmstadt

Place of first publication

Darmstadt

Year of first publication

2024

Publisher DOI

10.48694/inggrid.3983

Additional information

2022 NFDI4ing Conference Special Issue

Supplementary resources (research data)

https://gitlab.lrz.de/nfdi4ing/crawler/-/tree/master/SimpleApplication_PizzaOntology

https://doi.org/10.14459/2022mp1694401