Adaptive Data Quality Management for Evolutionary Data Clouds (DQ@Healthineers)

Third-party funded individual grant


Acronym: DQ@Healthineers

Start date: 01.01.2023

End date: 01.01.2026


Project details

Short description

We will create a generic data quality framework that can be deployed into an evolutionary data lake environment. The framework will make data quality quantifiable and help direct efforts toward data quality improvement.

Scientific Abstract

We propose to investigate the following research questions:

- Which characteristics enable a data quality framework to best identify and cluster the most relevant data quality problems in arbitrary business data landscapes?

- Can we capture the knowledge about typical data quality concerns and possible solutions in a knowledge graph in order to infer potential solutions in any given case?

- How can the data quality metrics in such a framework be designed so that they align well with fitness for use in different business contexts?

To address these questions, we will investigate the types of data quality problems that occur in evolutionary data lake environments. We will also investigate and compare possible methods to systematically detect and monitor such problems. Based on the results, we will conceptualize a framework for data quality monitoring built on an extensible metadata schema for data quality concerns.
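As a purely illustrative sketch of what an extensible metadata record for a data quality concern, together with a simple quantifiable metric, might look like: the class `QualityConcern`, the function `pass_rate`, the field names, and the sample rows below are all hypothetical assumptions made for this example and are not taken from the project.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class QualityConcern:
    """Hypothetical metadata record describing one data quality concern."""
    name: str
    dimension: str                      # e.g. "completeness" or "validity"
    column: str                         # column the concern applies to
    check: Callable[[Any], bool]        # returns True when a value passes
    # Open-ended slot so new attributes can be added without changing the schema.
    extra: Dict[str, Any] = field(default_factory=dict)


def pass_rate(rows: List[Row], concern: QualityConcern) -> float:
    """Fraction of rows whose value in concern.column passes the check."""
    if not rows:
        return 1.0  # assumption: an empty dataset has no violations
    passed = sum(1 for row in rows if concern.check(row.get(concern.column)))
    return passed / len(rows)


# Made-up sample data, for illustration only.
rows = [
    {"order_id": "a1", "quantity": 3},
    {"order_id": None, "quantity": 7},
    {"order_id": "c3", "quantity": -5},
    {"order_id": "d4", "quantity": 1},
]

concerns = [
    QualityConcern("id_present", "completeness", "order_id",
                   lambda v: v is not None),
    QualityConcern("quantity_plausible", "validity", "quantity",
                   lambda v: isinstance(v, int) and v >= 0,
                   extra={"owner": "example-team"}),
]

# One score per concern; each score lies between 0.0 and 1.0.
report = {c.name: pass_rate(rows, c) for c in concerns}
```

Keeping each check as data attached to a metadata record, rather than hard-coding it, is one possible way to let the set of monitored concerns grow as the data landscape evolves; other designs are equally conceivable.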

We will extract and classify relevant, generalizable data quality problems. Furthermore, we will examine the limitations of such a framework regarding its transferability to different IT landscapes. We will develop a set of tools and methods that solve the data quality reporting problem independently of the specific environment. We will evaluate our proposed framework and adaptation strategy through a proof-of-concept implementation.
