GEOMAR Conference & Event Management

28.–30. Apr. 2026
DKFZ, Heidelberg
Europe/Berlin timezone

Toward FAIR and Reproducible Data Quality Control: A Use Case–Driven Data-Quality Processing Metadata Schema for Time Series Data

Not scheduled
10m
Communication Center (DKFZ, Heidelberg)


Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
Talk 3. Ontology-Driven Metadata Harmonization: Closing Semantic Gaps (Talk Session)

Speaker

Dr. Ulrich Loup (Forschungszentrum Jülich GmbH)

Description

High-quality environmental time series data require transparent, reproducible, and well-documented quality control (QC) workflows that integrate automated procedures and expert judgment. While many QC frameworks offer algorithmic methods, the processing information explaining how data quality decisions are made — including parameterization, flag semantics, and manual interventions — is often not formalized enough to be easily reused, reproduced, or exchanged across infrastructures.

In this talk, we present a metadata schema for time series data that enables FAIR and reproducible data quality processing. The schema is designed to describe QC methods, execution contexts, and resulting quality flags in a machine-actionable and interoperable manner. It employs the OGC SensorThings API data model enhanced by the STAMPLATE schema and the concepts established in the SaQC framework. The schema follows the linked-data approach and aligns with standards such as the W3C Data Quality Vocabulary.
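To make the idea of machine-actionable QC processing metadata concrete, the following is a minimal, purely illustrative sketch in Python. All keys, identifiers, and values are assumptions for demonstration; they are not the actual schema presented in the talk, only a JSON-LD-style record loosely inspired by the W3C Data Quality Vocabulary, capturing parameterization, flag semantics, and execution context for one QC step.

```python
import json

# Hypothetical QC processing record (illustrative only, not the real schema).
# Keys, the datastream ID, and the flag scheme are assumed for this sketch.
qc_record = {
    "@context": {
        "dqv": "http://www.w3.org/ns/dqv#",
        "prov": "http://www.w3.org/ns/prov#",
    },
    "@type": "dqv:QualityAnnotation",
    # assumed identifier of the annotated SensorThings Datastream
    "target": "https://example.org/Datastreams(42)",
    "method": {
        "name": "rangeCheck",                       # illustrative QC routine
        "parameters": {"min": -40.0, "max": 60.0},  # parameterization
    },
    "flagScheme": "simple",                         # e.g. OK / SUSPICIOUS / BAD
    "result": {"flag": "BAD", "reason": "value above max"},
    "executedBy": "automatic",                      # vs. manual expert flagging
}

print(json.dumps(qc_record, indent=2))
```

Because the record is plain JSON-LD, it can be exchanged between infrastructures and re-executed or audited without access to the original QC software.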

The design of the proposed schema is motivated by concrete use cases for QC of time series data from the TERENO and ACTRIS observation networks. These use cases include detailed analyses of existing automated and manual QC workflows. By comparing and abstracting these practices, we derive common requirements and design patterns for representing QC processing information in a FAIR and reproducible manner. The resulting schema can be used straightforwardly with SensorThings API services and mapped into NetCDF files that align with the Helmholtz Metadata Guidelines for NetCDF. It can also be used with RO-Crates, for example to describe embedded CSV files.
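The RO-Crate route mentioned above can be sketched as follows. This is a hedged, stdlib-only illustration: the file names, CSV columns, and the property linking the file to its QC action are assumptions, not the mapping defined by the schema itself.

```python
import csv
import io
import json

# Illustrative sketch: a flagged time series serialized as CSV, described by a
# minimal RO-Crate-style metadata graph. Names and properties are assumed.
rows = [
    ("2026-04-28T00:00:00Z", 12.3, "OK"),
    ("2026-04-28T01:00:00Z", 99.9, "BAD"),  # failed an assumed range check
]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["timestamp", "air_temperature_degC", "qc_flag"])
writer.writerows(rows)
csv_payload = buf.getvalue()

# Minimal RO-Crate metadata linking the CSV to its QC processing step.
crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {"@id": "ro-crate-metadata.json", "@type": "CreativeWork",
         "about": {"@id": "./"}},
        {"@id": "./", "@type": "Dataset",
         "hasPart": [{"@id": "series.csv"}]},
        {"@id": "series.csv", "@type": "File",
         "encodingFormat": "text/csv",
         # assumed property pointing at the QC processing record
         "qualityAssessment": {"@id": "#qc-range-check"}},
        {"@id": "#qc-range-check", "@type": "CreateAction",
         "name": "rangeCheck", "description": "automatic range check"},
    ],
}

print(csv_payload)
print(json.dumps(crate, indent=2))
```

Packaging the flags next to the data in one crate keeps the QC provenance attached to the time series even when it leaves the originating infrastructure.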

Our metadata schema lays the foundation for a community-driven, FAIR, and reproducible quality control solution. Our goal is to integrate the requirements of other communities and to develop a web application that allows users to visually inspect and flag time series data in a manner consistent with our schema.

Alternative Track 2. Empowering Research Communities: Turning Metadata into Action

Author

Dr. Ulrich Loup (Forschungszentrum Jülich GmbH)

Co-authors

Mr. Benjamin Louisot (Karlsruhe Institute of Technology)
Christof Lorenz (Karlsruhe Institute of Technology, Karlsruhe, Germany)
Mr. David Schäfer (Helmholtz Centre for Environmental Research)
Dr. Jannis Groh (Forschungszentrum Jülich GmbH)
Mr. Jürgen Sorg (Forschungszentrum Jülich GmbH)
Marc Hanisch (GFZ Helmholtz-Zentrum für Geoforschung)
Mr. Martin Ingenbleek (Forschungszentrum Jülich GmbH)
Ms. Nicole Büttner (Karlsruhe Institute of Technology)
Dr. Ralf Kunkel (Forschungszentrum Jülich GmbH)
Mr. Robert Wiesen (Helmholtz Centre for Environmental Research)
Romy Fösig

Presentation materials

There are currently no materials.