Khaled Koutini

Student Assistant

My PhD project

Learning general-purpose audio representations with deep neural networks

Audio classification and tagging are central tasks in the field of 
machine listening. They are essential for machines to recognize the 
environment and identify events in their surroundings, and are thus a 
critical component of machine perception. These tasks are relevant in a 
wide range of applications, including content-based multimedia 
information retrieval, context-aware smart devices, and monitoring 
systems. One significant barrier to machine audio recognition is the 
high cost and scarcity of high-quality labeled data, particularly for 
powerful learners such as deep neural networks. We investigate the 
incorporation of inductive biases into neural network architectures and 
the training process in order to improve generalization when training on 
small audio datasets. We also investigate how to extract representations 
from models trained on large-scale, general-purpose datasets, to be 
transferred to specialized tasks.

    * *Supervisor:* Gerhard Widmer, JKU Linz, Austria

    * *Dates:* 01 Jan 2020 – Ongoing