LIT AI Lab/ELLIS Unit Linz Seminar.

In summer 2021, the LIT AI Lab kicks off the “LIT AI Lab/ELLIS Unit Linz Seminar”, which brings together researchers and students interested in the field of Artificial Intelligence.

You can follow the talks online via Zoom and/or in person (3G rules apply), as indicated for each talk below.

Abstract
Flatness of the loss curve is conjectured to be connected to the generalization ability of machine learning models, in particular neural networks. While it has been empirically observed that flatness measures consistently correlate strongly with generalization, it is still an open theoretical problem why and under which circumstances flatness is connected to generalization, in particular in light of reparameterizations that change certain flatness measures but leave generalization unchanged. We investigate the connection between flatness and generalization by relating it to the interpolation from representative data, deriving notions of representativeness, and feature robustness. The notions allow us to rigorously connect flatness and generalization and to identify conditions under which the connection holds. Moreover, they give rise to a novel, but natural relative flatness measure that correlates strongly with generalization, simplifies to ridge regression for ordinary least squares, and solves the reparameterization issue.
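The reparameterization issue mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the talk: a hypothetical two-parameter ReLU model, toy data, and the trace of the loss Hessian as a stand-in flatness measure (the abstract does not name a specific one). Rescaling the two weights by alpha and 1/alpha leaves the model's function, and hence its loss, unchanged, while the Hessian trace changes considerably.

```python
def relu(z):
    return z if z > 0.0 else 0.0

# Hypothetical toy data generated by y = 2 * x.
XS = [1.0, 2.0, 3.0]
YS = [2.0, 4.0, 6.0]

def loss(w1, w2):
    """Mean squared error of the two-parameter model x -> w2 * relu(w1 * x)."""
    return sum((w2 * relu(w1 * x) - y) ** 2 for x, y in zip(XS, YS)) / len(XS)

def hessian_trace(w1, w2, h=1e-3):
    """Trace of the loss Hessian, estimated with central finite differences."""
    base = loss(w1, w2)
    d2_w1 = (loss(w1 + h, w2) - 2.0 * base + loss(w1 - h, w2)) / h ** 2
    d2_w2 = (loss(w1, w2 + h) - 2.0 * base + loss(w1, w2 - h)) / h ** 2
    return d2_w1 + d2_w2

# Reparameterize: scale the first weight by alpha, the second by 1 / alpha.
alpha = 10.0
original = (1.0, 2.0)
rescaled = (original[0] * alpha, original[1] / alpha)

loss_original = loss(*original)
loss_rescaled = loss(*rescaled)
trace_original = hessian_trace(*original)
trace_rescaled = hessian_trace(*rescaled)

print("loss:", loss_original, loss_rescaled)
print("Hessian trace:", trace_original, trace_rescaled)
```

Both parameter settings compute the same function on positive inputs, so any measure that differs between them cannot by itself explain generalization.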

Physical place: Seminar Room SP3 318

Zoom:
https://jku.zoom.us/j/99116069385?pwd=VnZJN3Ntd2wyWGRSSU5HSVY5My9GQT09

Meeting-ID: 991 1606 9385
Password: 0815

Physical place: Seminar Room SP3 318

Online via https://jku.zoom.us/j/98246734423?pwd=QXJ4ekJBUkNHb3M0bXFKVEVuQ1h1dz09

Abstract
Many real-world problems can be formulated as the task of counting the models of a propositional formula, called #SAT. A model of a formula is an assignment to its variables such that the formula evaluates to true. Related to model counting is model enumeration, or All-SAT, in which the models of a formula are recorded.
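As a minimal sketch of these definitions, the following brute-force code enumerates and counts the models of a small formula by trying every assignment. The example formula and the function names are hypothetical, and exhaustive search is used only for illustration; it is not the method presented in the talk and it scales exponentially in the number of variables.

```python
from itertools import product

def enumerate_models(formula, variables):
    """Yield every total assignment (as a dict) under which `formula` is true."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            yield assignment

def count_models(formula, variables):
    """#SAT by exhaustive search: the number of satisfying total assignments."""
    return sum(1 for _ in enumerate_models(formula, variables))

# Hypothetical example formula: (a OR b) AND (NOT c)
variables = ["a", "b", "c"]

def example_formula(m):
    return (m["a"] or m["b"]) and not m["c"]

models = list(enumerate_models(example_formula, variables))  # All-SAT
model_count = count_models(example_formula, variables)       # #SAT

for model in models:
    print(model)
print("number of models:", model_count)
```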

In this talk we are going to present our work on the formalization of methods for propositional model counting and enumeration, with a focus on finding short partial models under projection. In a partial model, not all variables of the formula occur. A partial model therefore represents a set of total models, which enables a concise representation of a formula. Our methods return a Disjoint Sum-of-Products (DSOP), which is a formula in disjunctive normal form (DNF) whose disjuncts are pairwise disjoint. This ensures that each model is enumerated only once. We also devised a variant in which this uniqueness constraint is relaxed, to serve tasks in which repetitions cause no harm. This relaxation results in detecting even shorter models.
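The relation between partial models, disjointness, and counting can be sketched on a toy example. The formula and the hand-written DSOP below are hypothetical and only illustrate the definitions; they are not produced by the methods of the talk, and projection is not covered. A partial model that assigns k of n variables stands for 2^(n-k) total models, and because the disjuncts are pairwise disjoint, these numbers can simply be added up.

```python
from itertools import product

VARIABLES = ["a", "b", "c"]

def formula(m):
    """Hypothetical example formula: (a OR b) AND (NOT c)."""
    return (m["a"] or m["b"]) and not m["c"]

# Two partial models (cubes); variables that are missing are unconstrained.
DSOP = [
    {"a": True, "c": False},              # b is free: covers 2 total models
    {"a": False, "b": True, "c": False},  # all variables fixed: covers 1
]

def disjoint(c1, c2):
    """Two cubes share no total model iff they disagree on a common variable."""
    return any(c1[v] != c2[v] for v in c1.keys() & c2.keys())

def extensions(cube):
    """Yield every total assignment that extends the partial model `cube`."""
    free = [v for v in VARIABLES if v not in cube]
    for values in product([False, True], repeat=len(free)):
        yield {**cube, **dict(zip(free, values))}

pairwise_disjoint = all(
    disjoint(DSOP[i], DSOP[j])
    for i in range(len(DSOP))
    for j in range(i + 1, len(DSOP))
)
all_extensions_are_models = all(
    formula(m) for cube in DSOP for m in extensions(cube)
)

# Each cube with k assigned variables stands for 2**(n - k) total models.
count_from_dsop = sum(2 ** (len(VARIABLES) - len(cube)) for cube in DSOP)

# Reference value obtained by trying every total assignment.
brute_force_count = sum(
    1
    for values in product([False, True], repeat=len(VARIABLES))
    if formula(dict(zip(VARIABLES, values)))
)

print(pairwise_disjoint, all_extensions_are_models)
print(count_from_dsop, brute_force_count)
```

If the disjointness requirement were dropped, the sum over the disjuncts could count some models more than once, which is harmless only for tasks where repetitions do not matter.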

The focus in our work is on formalization and proofs. Preliminary results, either theoretical or experimental, show that the methods presented in this thesis enable us to find short partial models. Propositional model counting has been used, e.g., in frequent itemset mining, and there is a substantial amount of work concerned with learning DNFs from examples. The aim of this talk is to identify possible applications of our work in Artificial Intelligence.

Video of the Talk: https://youtu.be/Q9R6qm5iv1k

To participate enter with the following Zoom-Link:

https://jku.zoom.us/j/91383246249?pwd=eVcydG1mT1FZazNqUUcxeFFqeEYrdz09

Meeting-ID: 913 8324 6249
Password: 271828

Abstract
In machine learning and especially in deep learning there is one algorithm that, including many of its variations, is used almost universally for training large and non-linear models: stochastic gradient descent (SGD).
Applying an SGD method to minimize an objective gives rise to a discrete-time process of estimated parameter values. While the mathematical description is fairly simple, the behavior of the algorithm generally is not. In order to better understand the dynamics of the estimated values, it is reasonable to approximate the discrete-time process with the solution of a differential equation. The resulting gradient flow equation describes the mean evolution of the SGD process very well. However, it does not account for the noise inherent in the SGD method.
For example, it does not see the difference between different mini-batch sizes or between having an infinite list of fresh data versus a finite sample of data. To rectify this issue, one can introduce a noise term to the gradient flow equation, turning it into a so-called stochastic differential equation. A solution to the resulting equation is called a diffusion approximation to SGD.
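The following minimal sketch illustrates why gradient flow alone misses the mini-batch size. All choices are assumptions made for illustration and are not taken from the talk: a one-dimensional objective f(theta) = theta^2 / 2, Gaussian gradient noise with unit variance per sample, and hypothetical values for learning rate, step count, and batch sizes. The gradient flow for this objective has the closed-form solution theta(t) = theta_0 * exp(-t), evaluated at time t = learning rate * number of steps.

```python
import math
import random
import statistics

def run_sgd(theta0, lr, steps, batch_size, rng):
    """SGD on f(theta) = theta**2 / 2 with noisy mini-batch gradients."""
    theta = theta0
    for _ in range(steps):
        # The exact gradient is theta. Averaging `batch_size` independent
        # unit-variance noise samples has variance 1 / batch_size, so the
        # averaged noise is drawn directly with that variance.
        noise = rng.gauss(0.0, 1.0 / math.sqrt(batch_size))
        theta -= lr * (theta + noise)
    return theta

def gradient_flow(theta0, t):
    """Solution of d(theta)/dt = -theta, the gradient flow of theta**2 / 2."""
    return theta0 * math.exp(-t)

theta0, lr, steps, runs = 1.0, 0.01, 200, 1000
rng = random.Random(0)  # fixed seed so the sketch is reproducible

small_batch = [run_sgd(theta0, lr, steps, 1, rng) for _ in range(runs)]
large_batch = [run_sgd(theta0, lr, steps, 16, rng) for _ in range(runs)]

flow_value = gradient_flow(theta0, lr * steps)
mean_small = statistics.fmean(small_batch)
mean_large = statistics.fmean(large_batch)
var_small = statistics.variance(small_batch)
var_large = statistics.variance(large_batch)

print("gradient flow:", flow_value)
print("batch size 1:  mean", mean_small, "variance", var_small)
print("batch size 16: mean", mean_large, "variance", var_large)
```

In this toy setting, both batch sizes have a mean close to the gradient flow value, while the spread of the final parameter is clearly larger for the smaller batch. This is the kind of difference that a noise term in the differential equation is meant to capture.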

In this talk we describe how to explicitly calculate and compare the errors of gradient flow and the so-called first-order diffusion approximation. Further, we show that one can find an even better, second-order diffusion approximation. Finally, some applications of diffusion approximations are explored.

Abstract: Despite the recent successful application of Artificial Intelligence (AI) to games, the performance of cooperative agents in imperfect information games is still far from surpassing humans. Cooperating with teammates whose play-styles are not previously known poses additional challenges to current state-of-the-art algorithms. In the Swiss card game Jass, coordination within the two opposing teams is crucial for winning. Since verbal communication is forbidden, the only way to transmit information within the team is through a player’s play-style. This makes the game a particularly suitable candidate subject to continue the research on AI in cooperation games with hidden information. In this work, we analyse the effectiveness and shortcomings of several state-of-the-art algorithms (Monte Carlo Tree Search (MCTS) variants and Deep Neural Networks (DNNs)) at playing the Jass game.

Zoom link: https://jku.zoom.us/j/98246734423?pwd=QXJ4ekJBUkNHb3M0bXFKVEVuQ1h1dz09

Abstract: Convolutional Neural Networks (CNNs) have single-handedly dominated the field of computer vision for nearly a decade. In this talk I will present two recent papers that propose new and highly competitive architecture classes for computer vision. In the first part I will present the Vision Transformer model (ViT), which is almost identical to the standard transformer model used in natural language processing, but happens to work surprisingly well for vision applications. In the second part of the talk, I will present the MLP-Mixer model: an all-MLP architecture for vision. It can be seen as a simplified ViT model without the self-attention layer. Nevertheless, it also demonstrates strong results across a wide range of vision applications.

Zoom link: https://jku.zoom.us/j/98246734423?pwd=QXJ4ekJBUkNHb3M0bXFKVEVuQ1h1dz09 (Meeting-ID: 982 4673 4423, Password: 550113)

Abstract: Classical machine learning (ML) provides a potentially powerful approach to solving challenging problems in quantum physics and chemistry. However, the advantages of ML over more traditional methods have not been firmly established. We prove that classical ML algorithms can efficiently predict ground state properties of a physical system, after learning from data obtained by measuring related systems. We also prove that classical ML algorithms can efficiently classify a wide range of quantum phases of matter. Our arguments are based on the concept of a classical shadow, a succinct classical description of a quantum state that can be constructed in feasible quantum experiments and be used to predict many properties of the state.

This is joint work with Robert Huang (Caltech), Giacomo Torlai (AWS), Victor Albert (University of Maryland) and John Preskill (Caltech+AWS).

Zoom link: https://jku.zoom.us/j/98246734423?pwd=QXJ4ekJBUkNHb3M0bXFKVEVuQ1h1dz09 (Meeting-ID: 982 4673 4423, Password: 550113)

Thursday, 3 November at 1 pm (Hybrid meeting - Link and Room below)
Michael Kamp, Ruhr-University Bochum
Relative Flatness and Generalization

Wednesday, 29 June 2022 at 10:30 am (Hybrid meeting - Link and Room below)
Sibylle Möhle, Institute for Formal Models and Verification
On Propositional Model Counting and Enumeration

Wednesday, 19 January 2022 at 1:00 pm (Zoom Meeting - Link below)
Stefan Perko, Research Assistant (PhD student), Friedrich-Schiller-Universität Jena (joint work with Stefan Ankirchner)
Approximating stochastic gradient descent with diffusions

Wednesday, 1 September 2021 at 10 am, Room S3 055 or via Zoom (link further below)
Thomas Koller, Hochschule Luzern Informatik
Challenging Human Supremacy: Evaluating Monte Carlo Tree Search and Deep Learning for the Trick Taking Card Game Jass

Thursday, 19 August 2021 at 2:30 pm, Zoom (link below)
Alexander Kolesnikov, Google Switzerland
New Vision Architectures Beyond CNNs

Wednesday, 11 August 2021 at 10 am, Lecture hall 19 or via Zoom (link further below)
Richard Küng, Institute for Integrated Circuits, Johannes Kepler University Linz
Provably efficient machine learning for quantum physics