Data Quality Measurement: Readability Dimension

Supervisor: a.Univ.-Prof. DI Dr. Wolfram Wöß
Co-Supervisor: DI Lisa Ehrlinger, BSc

Motivation and Challenges

Data is central to decision-making in enterprises and organizations (e.g., smart factories and predictive maintenance) as well as in private life (e.g., booking platforms). Especially in artificial intelligence applications, such as self-driving cars, trust in data-driven decisions depends directly on the quality of the underlying data. It is therefore essential to know the quality of the data in order to assess the trustworthiness of the decisions derived from it.

QuaIIe, a Java-based tool developed at our institute, analyzes different information sources and calculates metrics to estimate the data and schema quality of an information system. It currently supports metrics for the quality dimensions accuracy, correctness, completeness, pertinence, timeliness, minimality, and normalization. However, an investigation of the readability dimension at both the schema level and the data level is still missing.

Objective

The main objective of this master's thesis is to evaluate current data quality (DQ) approaches for assessing the readability dimension with respect to its definition and possible metrics. Building on existing work, an approach for actually measuring readability should be developed (e.g., using intelligent string matching or dictionary-based methods with tools like WordNet). The resulting metric should be implemented and evaluated within our existing DQ tool QuaIIe.
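
To make the dictionary-based idea more concrete, the following is a minimal sketch of one possible readability score for schema labels: the share of word tokens that appear in a dictionary. Everything in it is an assumption for illustration only. The class and method names are hypothetical and do not reflect QuaIIe's actual API, the small hard-coded word set merely stands in for a real lexical resource such as WordNet, and treating an input without any tokens as fully readable is an arbitrary choice.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical illustration, not part of QuaIIe.
public class ReadabilitySketch {

    /** Splits a label such as "customerName" or "order_date" into lowercase word tokens. */
    public static List<String> tokenize(String text) {
        // Insert a space at lower-to-upper camelCase boundaries, then split on non-letters.
        String spaced = text.replaceAll("([a-z])([A-Z])", "$1 $2");
        return Arrays.stream(spaced.split("[^A-Za-z]+"))
                .filter(token -> !token.isEmpty())
                .map(token -> token.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    /**
     * Returns the fraction of tokens found in the dictionary, in the range [0, 1].
     * Assumption: an input that yields no tokens is scored 1.0.
     */
    public static double readability(List<String> labels, Set<String> dictionary) {
        int total = 0;
        int known = 0;
        for (String label : labels) {
            for (String token : tokenize(label)) {
                total++;
                if (dictionary.contains(token)) {
                    known++;
                }
            }
        }
        return total == 0 ? 1.0 : (double) known / total;
    }

    public static void main(String[] args) {
        // Stand-in dictionary; a real metric would query a lexical database instead.
        Set<String> dictionary = Set.of("customer", "name", "order", "date");
        List<String> columnNames = List.of("customerName", "order_date", "cst_nm");
        // Four of the six tokens (customer, name, order, date) are dictionary words.
        System.out.println(readability(columnNames, dictionary));
    }
}
```

A pure lookup like this penalizes abbreviations such as "cst_nm", which is the intended effect, but it would also penalize legitimate domain terms missing from the dictionary. Approximate string matching is one way such cases might be handled.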