Speaker
Description
Computational Science and Engineering relies on complex, multi-step workflows that combine simulations, data processing, and parameter-driven analyses across heterogeneous environments. Ensuring reproducibility in such settings requires not only abstract workflow descriptions but also semantically rich metadata that is interoperable across domains.
In this work, we present MaRDIFlow, a lightweight, metadata-driven workflow framework developed within the MaRDI consortium for research data management in the mathematical sciences. MaRDIFlow executes workflows through explicit input–output relationships between components, enabling structured metadata descriptions at different abstraction levels. Redundant representations of models, code, and data are supported to strengthen reproducibility and reuse.
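The idea of driving execution from explicit input-output relationships can be illustrated with a minimal sketch. This is not MaRDIFlow's actual API: the `Component` class, the `execute` function, and the two example steps are hypothetical names chosen for illustration. The sketch only assumes that each component declares the named data it consumes and produces, so that an execution order can be derived from those declarations.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Callable, Dict, List


@dataclass
class Component:
    """Hypothetical workflow step described by its named inputs and outputs."""
    name: str
    inputs: List[str]
    outputs: List[str]
    run: Callable[[Dict[str, object]], Dict[str, object]]
    metadata: Dict[str, str] = field(default_factory=dict)


def execute(components: List[Component], initial: Dict[str, object]) -> Dict[str, object]:
    """Run components in an order derived from their input-output relationships."""
    producer = {out: c.name for c in components for out in c.outputs}
    by_name = {c.name: c for c in components}
    # A component depends on whichever component produces each of its inputs.
    graph = {
        c.name: {producer[i] for i in c.inputs if i in producer}
        for c in components
    }
    data = dict(initial)
    for name in TopologicalSorter(graph).static_order():
        comp = by_name[name]
        missing = [i for i in comp.inputs if i not in data]
        if missing:
            raise KeyError(f"{name} is missing inputs: {missing}")
        data.update(comp.run({i: data[i] for i in comp.inputs}))
    return data


# Hypothetical two-step workflow: simulate, then post-process.
simulate = Component(
    name="simulate",
    inputs=["step_size"],
    outputs=["trajectory"],
    run=lambda d: {"trajectory": [k * d["step_size"] for k in range(4)]},
    metadata={"kind": "simulation"},
)
analyse = Component(
    name="analyse",
    inputs=["trajectory"],
    outputs=["maximum"],
    run=lambda d: {"maximum": max(d["trajectory"])},
    metadata={"kind": "post-processing"},
)

# The list order does not matter; the dependency graph decides.
result = execute([analyse, simulate], {"step_size": 0.5})
print(result["maximum"])
```

Because the order is computed from the declared inputs and outputs rather than from the order in which components are listed, the same declarations can double as machine-readable metadata about data dependencies.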
To address semantic interoperability, MaRDIFlow integrates domain-specific ontologies via RESTful APIs and SPARQL endpoints. This allows workflow components and their metadata to be dynamically aligned with standardized vocabularies during both construction and execution. As a concrete example, we integrate Voc4Cat, a domain-specific ontology and SKOS vocabulary from the NFDI4Cat consortium, which serves as a semantic backbone for annotating workflow components and metadata dependencies. Through this integration, knowledge graphs are used to represent and query relationships across workflow layers, supporting automated discovery, validation, and consistent interpretation of data.
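As a rough illustration of how a metadata term might be aligned with a SKOS vocabulary through a SPARQL endpoint, the sketch below builds a label-lookup query and encodes it as a GET request in the style of the SPARQL 1.1 Protocol. The endpoint URL, the label `catalyst`, and the function names are assumptions made for this example; they are not taken from MaRDIFlow or from the Voc4Cat deployment, and no request is actually sent.

```python
from urllib.parse import urlencode

SKOS = "http://www.w3.org/2004/02/skos/core#"


def label_lookup_query(label: str, lang: str = "en") -> str:
    """Build a SPARQL query that finds SKOS concepts by preferred label."""
    # Escape backslashes and double quotes so the label stays a valid literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"PREFIX skos: <{SKOS}>\n"
        "SELECT ?concept WHERE {\n"
        f'  ?concept skos:prefLabel "{escaped}"@{lang} .\n'
        "}"
    )


def request_url(endpoint: str, query: str) -> str:
    """Encode the query in the `query` parameter of a GET request."""
    return f"{endpoint}?{urlencode({'query': query})}"


# Hypothetical endpoint and label, for illustration only.
query = label_lookup_query("catalyst")
url = request_url("https://example.org/sparql", query)
print(query)
```

A workflow component could store the concept IRI returned by such a query next to the free-text metadata value, so that later validation compares IRIs rather than strings.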
The presented use cases demonstrate how combining workflow descriptions with domain ontologies enhances semantic consistency, interoperability, and reproducibility. This work highlights the practical role of domain and application ontologies in building reusable data infrastructures for computational workflows and lays the groundwork for extending MaRDIFlow with additional NFDI ontologies across disciplines.