Uncertain Data and Query Processing

Supervisor: a.Univ.-Prof. DI Dr. Josef Küng
Co-Supervisor: Dr. Maksim Goman

Introduction, Motivation

Business processes are growing more complex, data streams from sensors and from interchange with third-party software are growing in speed and volume, big data analysis has become
a routine activity, and in-memory databases have become normal practice. Complex business analysis is increasingly performed in real time on every minor data update, and control dashboards
are updated constantly. Uncertainty is present in many kinds of data, mainly in forecasts and estimates, imprecise measurements and ambiguous observations, where
it cannot be excluded. In massive data streams, risk may originate directly from uncertainty in the raw data. Beyond risk analysis, applications of uncertain
data processing include real-time scheduling of atomic operations (e.g. in manufacturing), quality control, monitoring of user satisfaction, worldwide fraud management in financial
transactions, etc. For instance, imagine a logistic supply chain, such as a cold supply chain, in which products lose their value according to a known distribution.
Certain parameters influence the actual condition of each item across many product types, and the distributions account for this. Some of these parameters are regularly updated
for all or some items, although this information may contain random error too. The real condition of each unit is unknown until it is sold or disposed of. Nevertheless, the possible loss
due to natural deterioration and the need for additional delivery to certain hubs or regions can be computed probabilistically. The value of an uncertain aggregation with some probabilistic
threshold can be used in a decision rule in a decision support or inventory management system, as well as in continuously updated cost/benefit or sales trend forecasts in an ERP system.
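
To make the idea of an uncertain aggregation with a probabilistic threshold more concrete, the following is a minimal Python sketch, not a prescribed method. It assumes that the loss of each item is an independent Gaussian random variable truncated at zero, and it estimates by Monte Carlo sampling the probability that the total loss exceeds a limit. The function names, the Gaussian assumption, the sample size and the confidence level are all hypothetical choices made for illustration.

```python
import random


def prob_sum_exceeds(items, limit, n_samples=10_000, seed=0):
    """Estimate P(sum of per-item losses > limit) by Monte Carlo sampling.

    items: list of (mean, std_dev) pairs, one per item. Each loss is assumed
    (for illustration only) to be Gaussian and truncated at zero.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    hits = 0
    for _ in range(n_samples):
        # Draw one possible world: a concrete loss value for every item.
        total = sum(max(0.0, rng.gauss(mu, sigma)) for mu, sigma in items)
        if total > limit:
            hits += 1
    return hits / n_samples


def needs_extra_delivery(items, limit, confidence=0.9):
    """Hypothetical decision rule: act only if the loss limit is exceeded
    with at least the given probability."""
    return prob_sum_exceeds(items, limit) >= confidence
```

A decision support system could call `needs_extra_delivery` after each parameter update. Which distributions and which estimation method are appropriate in practice is exactly what a thesis in this area would need to investigate.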

Bachelor or Master Theses

Depending on the student's study background, one of the following sub-topics can be chosen for a Bachelor Thesis or Master Thesis:

1. Review state-of-the-art methods for uncertain data processing, existing uncertain query models and semantics
2. Syntax of an uncertain query similar to SQL in relational databases
3. Uncertain comparison operators for uncertain data analysis (<,>,=)
4. New model of an uncertain query using the chance constraint method and an SQL-like syntax
5. Principles and syntax for uncertain conditioning operators like GROUP BY or WHERE a>b and their implementation
6. Principles and syntax for uncertain aggregation operators (MIN, MAX, AVG, etc.) and their implementation
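
As an illustration of what an uncertain comparison operator and an uncertain WHERE a>b condition might look like, here is a minimal Python sketch. It assumes that both attributes are independent Gaussian random variables, so that the probability of a > b has a closed form, and that a row qualifies when this probability reaches a user-chosen threshold, in the spirit of a chance constraint. The function names, the row layout and the Gaussian assumption are hypothetical and are not taken from any existing query model.

```python
import math


def prob_greater(mu_a, sigma_a, mu_b, sigma_b):
    """P(A > B) for independent A ~ N(mu_a, sigma_a^2) and B ~ N(mu_b, sigma_b^2)."""
    spread = math.sqrt(sigma_a ** 2 + sigma_b ** 2)
    if spread == 0.0:
        # Both values are certain: fall back to the ordinary comparison.
        return 1.0 if mu_a > mu_b else 0.0
    z = (mu_a - mu_b) / spread
    # Standard normal CDF evaluated at z, expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def where_greater(rows, threshold):
    """Keep the rows for which P(a > b) is at least the threshold."""
    return [
        row for row in rows
        if prob_greater(row["a_mu"], row["a_sigma"],
                        row["b_mu"], row["b_sigma"]) >= threshold
    ]


# Hypothetical sample rows with uncertain attributes a and b.
rows = [
    {"id": 1, "a_mu": 5.0, "a_sigma": 1.0, "b_mu": 2.0, "b_sigma": 1.0},
    {"id": 2, "a_mu": 2.0, "a_sigma": 1.0, "b_mu": 2.0, "b_sigma": 1.0},
    {"id": 3, "a_mu": 1.0, "a_sigma": 0.5, "b_mu": 4.0, "b_sigma": 0.5},
]
likely = where_greater(rows, threshold=0.9)
```

With a threshold of 0.9, only the first row is kept, since the second row satisfies a > b with probability 0.5 and the third with a probability close to zero. Other distributions, correlated attributes and the semantics of equality are open design questions for the sub-topics above.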

The student is expected to review the problems and current methods of the chosen topic,
define a sample data set, build the model, and develop the solution artefact,
demonstrate the application of the developed solution to the data,
and discuss and summarize the results, paying attention to the comparison with alternative techniques.

Instructions and literature will be provided by the supervisor.

A successful thesis on a topic in our area is a ticket to a team of unique professionals who develop modern data processing and querying algorithms at companies and research institutions.
