Institut für Angewandte Statistik

Research Seminar at the Institut für Angewandte Statistik (Institute of Applied Statistics)

October 20, 15:30 - Dr. Alejandra Avalos Pacheco: “Multi-study Factor Regression Models for Large Complex Data with Applications to Nutritional Epidemiology and Cancer Genomics”

Zoom link

meeting ID: 937 6054 7545

password: 946296


Integrating data from multiple studies can be key to understanding and gaining knowledge in statistical research. However, such data present both biological and artifactual sources of variation, also known as covariate effects. Covariate effects can be complex, leading to systematic biases. In this talk I will present novel sparse latent factor regression (FR) and multi-study factor regression (MSFR) models to integrate such heterogeneous data. The FR model provides a tool for data exploration via dimensionality reduction and sparse low-rank covariance estimation while correcting for a range of covariate effects. MSFR models extend FR, enabling us to jointly obtain a covariance structure that models the group-specific covariances in addition to the common component, while learning covariate effects from observed variables such as demographic information. I will discuss the use of several sparse priors (local and non-local) to learn the dimension of the latent factors. Our approach provides a flexible methodology for sparse factor regression which is not limited to data with covariate effects. I will present several examples, with a focus on bioinformatics applications. We show the usefulness of our methods in two main tasks: (1) to give a visual representation of the latent factors of the data, i.e. an unsupervised dimension reduction task, and (2) to provide (i) a supervised survival analysis, using the factors obtained in our method as predictions for the cancer genomic data, and (ii) a dietary pattern analysis, associating each factor with a measure of overall diet quality related to cardiometabolic disease risk for a Hispanic community health nutritional data study.
Our results show an increase in the accuracy of the dimensionality reduction, with non-local priors substantially improving the reconstruction of factor cardinality. The results of our analyses illustrate how failing to properly account for covariate effects can result in unreliable inference.
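As a rough illustration of the idea behind factor regression described in the abstract, the following is a minimal Python sketch, not the speaker's method. It assumes a model of the form X = V θᵀ + Z Mᵀ + E (observed covariates V, latent factors Z, noise E) and swaps the sparse Bayesian estimation from the talk for a simple two-step procedure: ordinary least squares for the covariate effects, then a truncated SVD of the residuals. All names (`simulate`, `fit_factor_regression`, `theta`, `M`) and all simulation settings are hypothetical.

```python
import numpy as np


def simulate(n=500, p=20, q=2, k=3, seed=0):
    """Draw synthetic data from X = V theta^T + Z M^T + E (assumed model)."""
    rng = np.random.default_rng(seed)
    V = rng.normal(size=(n, k))        # observed covariates (e.g. demographics)
    theta = rng.normal(size=(p, k))    # covariate effects
    M = rng.normal(size=(p, q))        # factor loadings
    Z = rng.normal(size=(n, q))        # latent factors
    E = 0.5 * rng.normal(size=(n, p))  # idiosyncratic noise, variance 0.25
    X = V @ theta.T + Z @ M.T + E
    return X, V, theta, M


def fit_factor_regression(X, V, q):
    """Two-step stand-in: OLS for covariate effects, SVD for the factors."""
    n = X.shape[0]
    # Step 1: regress every observed variable on the covariates.
    coef, *_ = np.linalg.lstsq(V, X, rcond=None)  # shape (k, p)
    theta_hat = coef.T
    R = X - V @ coef                              # covariate-corrected data
    # Step 2: rank-q approximation of the residuals.
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    Z_hat = np.sqrt(n) * U[:, :q]                 # factor scores, unit scale
    M_hat = (Vt[:q].T * s[:q]) / np.sqrt(n)       # loadings, shape (p, q)
    # Low-rank plus diagonal covariance estimate.
    psi_hat = (R - Z_hat @ M_hat.T).var(axis=0)
    cov_hat = M_hat @ M_hat.T + np.diag(psi_hat)
    return theta_hat, M_hat, Z_hat, cov_hat


if __name__ == "__main__":
    X, V, theta, M = simulate()
    theta_hat, M_hat, Z_hat, cov_hat = fit_factor_regression(X, V, q=2)
    print("mean abs error in covariate effects:",
          np.abs(theta_hat - theta).mean())
```

The columns of `Z_hat` can be plotted against each other for a visual summary of the data after covariate correction. Unlike the models in the talk, this sketch fixes the number of factors `q` in advance and applies no sparsity to the loadings.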


Date & Time


15:30 - 17:00



S2 Z74, Science Park 2