Graph Data Profiling

Student: Katharina Wolf (2021)


Supervisor: a.Univ.-Prof. DI Dr. Wolfram Wöß
Co-Supervisor: DI Lisa Ehrlinger, BSc

Motivation and Challenges

Data profiling is the process of examining the data available in an existing information source (e.g., a database or a file) and collecting statistics or informative summaries about that data. Such a data profile can serve as a basis for ongoing data quality measurement. Abedjan et al. (2015) provide a comprehensive classification of data profiling tasks for relational data. At our institute, we developed a program called BlocK-DaQ (Blockchain-based Knowledge Graph for Data Quality Measurement), which makes it possible to create a reference data profile for relational data. However, so far there is no work on data profiling for data stored in graph DBs.

Objective

The aim of this thesis is to develop a concept of what a data profile for a graph could look like, what information it should contain, and in which format it should ideally be stored. In addition, a program should be implemented to automatically generate a reference data set for a graph DB.
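
To make the idea of a graph data profile more concrete, the following is a minimal sketch in Python. It is purely illustrative and not part of the thesis or of BlocK-DaQ: the function name `profile_graph`, the in-memory graph representation, the chosen statistics (label counts, edge type counts, degree summary, property completeness), and JSON as storage format are all assumptions made for this example.

```python
import json
from collections import Counter, defaultdict
from statistics import mean


def profile_graph(nodes, edges):
    """Compute a small, hypothetical data profile for a property graph.

    nodes: dict mapping node id -> {"label": str, "props": dict}
    edges: list of (source id, edge type, target id) tuples
    """
    # Cardinalities: how many nodes per label and edges per type.
    label_counts = Counter(n["label"] for n in nodes.values())
    edge_type_counts = Counter(edge_type for _, edge_type, _ in edges)

    # Degree (in + out) of every node, including isolated nodes.
    degree = Counter()
    for src, _, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    degrees = [degree[node_id] for node_id in nodes]

    # Completeness: share of nodes of a label with a non-null property value.
    keys_by_label = defaultdict(set)
    non_null = Counter()
    for n in nodes.values():
        for key, value in n["props"].items():
            keys_by_label[n["label"]].add(key)
            if value is not None:
                non_null[(n["label"], key)] += 1
    completeness = {
        f"{label}.{key}": non_null[(label, key)] / label_counts[label]
        for label, keys in keys_by_label.items()
        for key in sorted(keys)
    }

    return {
        "node_count": len(nodes),
        "edge_count": len(edges),
        "label_counts": dict(label_counts),
        "edge_type_counts": dict(edge_type_counts),
        "degree": {
            "min": min(degrees) if degrees else 0,
            "max": max(degrees) if degrees else 0,
            "mean": mean(degrees) if degrees else 0,
        },
        "property_completeness": completeness,
    }


# Tiny example graph (made-up data).
example_nodes = {
    1: {"label": "Person", "props": {"name": "Ada", "age": 36}},
    2: {"label": "Person", "props": {"name": "Bob", "age": None}},
    3: {"label": "City", "props": {"name": "Linz"}},
}
example_edges = [(1, "KNOWS", 2), (1, "LIVES_IN", 3), (2, "LIVES_IN", 3)]

profile = profile_graph(example_nodes, example_edges)
print(json.dumps(profile, indent=2))
```

A profile of this kind could be stored alongside the graph and later compared against a freshly computed one to detect changes in data quality; which statistics are actually meaningful for graph data is exactly the question the thesis is meant to answer.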