Institute of Computational Perception

Whither Music?

Exploring Musical Possibilities via Machine Simulation

Scientific Publications

PERCEIVE AND PREDICT:
Explainable and Robust Perception & Prediction Models

Karystinaios, E., Foscarin, F. and Widmer, G. (2024).
Perception-Inspired Graph Convolution for Music Understanding Tasks.
In Proceedings of the 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024), Jeju, Korea. (to appear)

Karystinaios, E., Foscarin, F. and Widmer, G. (2023). [Code]
Musical Voice Separation as Link Prediction: Modeling a Musical Perception Task as a Multi-Trajectory Tracking Problem.
In Proceedings of the 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), Macao, S.A.R.

Foscarin, F., Harasim, D. and Widmer, G. (2023). [Code]
Predicting Music Hierarchies with a Graph-Based Neural Decoder.
In Proceedings of the 24th Conference of the International Society for Music Information Retrieval (ISMIR 2023), Milan, Italy.

Karystinaios, E. and Widmer, G. (2023). [Code]
Roman Numeral Analysis with Graph Neural Networks: Onset-wise Predictions from Note-wise Features.
In Proceedings of the 24th Conference of the International Society for Music Information Retrieval (ISMIR 2023), Milan, Italy.

Peter, S. (2023). [Code]
Online Symbolic Music Alignment with Offline Reinforcement Learning.
In Proceedings of the 24th Conference of the International Society for Music Information Retrieval (ISMIR 2023), Milan, Italy.

Chowdhury, S. (2022).
Modeling Emotional Expression in Music Using Interpretable and Transferable Perceptual Features.
PhD Thesis, Johannes Kepler University Linz.

Bjare, M., Lattner, S. and Widmer, G. (2022). [Code]
Differentiable Short-Term Models for Efficient Online Learning and Prediction in Monophonic Music.
Transactions of the International Society for Music Information Retrieval 5(1), 190–207.

Karystinaios, E. and Widmer, G. (2022). [Code]
Cadence Detection in Symbolic Classical Music using Graph Neural Networks.
In Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR 2022), Bengaluru, India.

PERFORM AND INTERACT:
Transparent Models of Musical Performance and Interaction

Cancino Chacon, C., Peter, S., Hu, P., Karystinaios, E., Henkel, F., Foscarin, F., Varga, N. and Widmer, G. (2023). [Code]
The ACCompanion: Combining Reactivity, Robustness, and Musical Expressivity in an Automatic Piano Accompanist.
In Proceedings of the 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), Macao, China.

Chowdhury, S. and Widmer, G. (2023).
Expressivity-aware Music Performance Retrieval using Mid-level Perceptual Features and Emotion Word Embeddings.
In Proceedings of Forum of Information Retrieval and Evaluation (FIRE 2023), Goa University, Panjim, India.

Chowdhury, S. and Widmer, G. (2022). [Demo Video]
Decoding and Visualising Intended Emotion in an Expressive Piano Performance.
Late-Breaking/Demo Papers, 23rd International Society for Music Information Retrieval Conference (ISMIR 2022), Bengaluru, India.

Cancino Chacon, C., Peter, S. and Widmer, G. (2022).
Can We Achieve Togetherness with an Artificial Partner? Insights and Challenges from Developing an Automatic Accompaniment System. (Extended Abstract)
In Musical Togetherness Symposium (MTS-22), Vienna, Austria.

GENERATE:
Controllable Music Generation Models

Plasser, M., Peter, S., and Widmer, G. (2023). [Code]
Discrete Diffusion Probabilistic Models for Symbolic Music Generation.
In Proceedings of the 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), Macao, China.

Bjare, M., Lattner, S. and Widmer, G. (2023).
Exploring Sampling Techniques for Generating Melodies with a Transformer Language Model.
In Proceedings of the 24th Conference of the International Society for Music Information Retrieval (ISMIR 2023), Milan, Italy.

Plasser, M. (2023).
Symbolic Music Generation Using Discrete Diffusion Probabilistic Models.
Master Thesis, Inst. of Computational Perception, Johannes Kepler University Linz, Austria.

Hausberger, A., Pichler, C., Pilkov, I., Wögerbauer, J., Cancino-Chacón, C. and Peter, S. (2022).
"On a Journey Together": An interactively composed song, and our student team's submission to the AI Song Contest 2022.

EXPERIMENT AND SIMULATE:
Musical Studies, Experiments, Critical Reflections

Martak, L., Hu, P., and Widmer, G. (2024).
Quantifying the Corpus Bias Problem in Automatic Music Transcription Systems.
In International Workshop on Sound Signal Processing Applications (IWSSPA 2024), Costa Ballena, Spain.

Peter, S., Chowdhury, S., Cancino-Chacón, C., and Widmer, G. (2023).
Are we describing the same sound? An Analysis of Word Embedding Spaces of Expressive Piano Performance.
In Proceedings of Forum of Information Retrieval and Evaluation (FIRE 2023), Goa University, Panjim, India.

Peter, S., Cancino-Chacón, C., Karystinaios, E. and Widmer, G. (2023).
Sounding Out Reconstruction Error-Based Evaluation of Generative Models of Expressive Performance.
In Proceedings of the 10th International Conference on Digital Libraries for Musicology (DLfM 2023), Milan, Italy.

Nikrang, A., Grachten, M., Gasser, M., Frostel, H., Widmer, G., and Collins, T. (2023).
Music Visualisation and its Short-term Effect on Appraisal Skills.
Late Breaking Papers, 25th International Conference on Human-Computer Interaction (HCII 2023), Copenhagen, Denmark. Springer Verlag.

Cancino-Chacón, C. (2023).
Commentary on “A Computational Approach to the Detection and Prediction of (Ir)Regularity in Children’s Folk Songs”.
Empirical Musicology Review 16(2), pp. 328-335. (published Mar 2023)

FUNDAMENTAL TECHNOLOGIES:
Representation, Explanation, Control, Robustness, Complexity Reduction

EXPLAINABILITY AND ROBUSTNESS:

Karystinaios, E., Foscarin, F. and Widmer, G. (2024).
SMUG-Explain: A Framework for Symbolic Music Graph Explanations.
In Proceedings of the Sound and Music Computing Conference (SMC 2024), Porto, Portugal. (to appear)

Foscarin, F., Hoedt, K., Praher, V., Flexer, A. and Widmer, G. (2022). [Code]
Concept-Based Techniques for "Musicologist-Friendly" Explanations in a Deep Music Classifier.
In Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR 2022), Bengaluru, India.

Hoedt, K., Praher, V., Flexer, A. and Widmer, G. (2022). [Code]
Constructing Adversarial Examples to Investigate the Plausibility of Explanations in Deep Audio and Image Classifiers.
Neural Computing and Applications (2022). DOI: 10.1007/s00521-022-07918-7

Prinz, K., Flexer, A. and Widmer, G. (2022). [Code]
Defending a Music Recommender Against Hubness-Based Adversarial Attacks.
In Proceedings of the 2022 Sound and Music Computing Conference (SMC 2022), Saint-Etienne, France.

Martak, L.S., Kelz, R. and Widmer, G. (2022). [Code]
Balancing Bias and Performance in Polyphonic Piano Transcription Systems.
Frontiers in Signal Processing 2 (2022). DOI: 10.3389/frsip.2022.975932

Marták, L., Kelz, R. and Widmer, G. (2022).
Differentiable Dictionary Search: Integrating Linear Mixing with Deep Non-linear Modelling for Audio Source Separation.
In Proceedings of the 24th International Congress on Acoustics (ICA 2022), Gyeongju, Korea.

EFFECTIVE REPRESENTATIONS FOR SOUND AND MUSIC:

Karystinaios, E., Foscarin, F., Jacquemard, F., Sakai, M., Tojo, S. and Widmer, G. (2023).
8+8=4: Formalizing Time Units to Handle Symbolic Music Durations.
In Proceedings of the 16th International Symposium on Computer Music Multidisciplinary Research (CMMR 2023), Tokyo, Japan.

Zhang, H., Karystinaios, E., Dixon, S., Widmer, G., and Cancino-Chacón, C. (2023). [Code]
Symbolic Music Representations for Classification Tasks: A Systematic Evaluation.
In Proceedings of the 24th Conference of the International Society for Music Information Retrieval (ISMIR 2023), Milan, Italy.

Koutini, K., Masoudian, S., Schmid, F., Eghbal-zadeh, H., Schlüter, J. and Widmer, G. (2022). [Code]
Learning General Audio Representations with Large-Scale Training of Patchout Audio Transformers.
In HEAR: Holistic Evaluation of Audio Representations (Proceedings of Machine Learning Research, PMLR, 166:65–89).

SIMPLICITY AND EFFICIENCY:

Schmid, F., Koutini, K. and Widmer, G. (2024).
Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models.
IEEE/ACM Transactions on Audio, Speech, and Language Processing 32, 2227–2241.
DOI: 10.1109/TASLP.2024.3376984

Schmid, F., Koutini, K. and Widmer, G. (2023). [Code]
Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation.
In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023), Rhodos, Greece.

MULTI-MODALITY:

Carvalho, L. and Widmer, G. (2023).
Passage Summarization with Recurrent Models for Robust Audio – Sheet Music Retrieval.
In Proceedings of the 24th Conference of the International Society for Music Information Retrieval (ISMIR 2023), Milan, Italy.

Carvalho, L. and Widmer, G. (2023).
Towards Robust and Truly Large-Scale Audio–Sheet Music Retrieval.
In Proceedings of the IEEE 6th International Conference on Multimedia Information Processing and Retrieval (MIPR 2023), Singapore.

Carvalho, L., Washüttl, T., and Widmer, G. (2023). [Code to come]
Self-supervised Contrastive Learning for Robust Audio - Sheet Music Retrieval Systems.
In Proceedings of the ACM Multimedia Systems Conference 2023 (MMSys’23), Vancouver, Canada.
DOI: 10.1145/3587819.3590968

Henkel, F. (2022). [Code]
Multi-modal Deep Learning for On-line Music Following in Score Sheet Images.
PhD Thesis, Johannes Kepler University Linz.

RESEARCH RESOURCES:
Open Source Datasets and Software Tools

Cancino-Chacón, C. and Pilkov, I. (2024). [Data]
The Rach3 Dataset: Towards Data-Driven Analysis of Piano Performance Rehearsal.
In Proceedings of the 30th Conference on Multimedia Modeling (MMM24), Amsterdam, The Netherlands.

Peter, S. D., Cancino-Chacón, C., Foscarin, F., McLeod, A., Henkel, F., Karystinaios, E., and Widmer, G. (2023). [Data | Code | Web Tool]
Automatic Note-Level Score-to-Performance Alignments in the ASAP Dataset.
Transactions of the International Society for Music Information Retrieval 6(1), 27–42.

Hu, P. and Widmer, G. (2023). [Data]
The Batik-plays-Mozart Corpus: Linking Performance to Score to Musicological Annotations.
In Proceedings of the 24th Conference of the International Society for Music Information Retrieval (ISMIR 2023), Milan, Italy.

Cancino-Chacón, C., Peter, S., Karystinaios, E., Foscarin, F., Grachten, M. and Widmer, G. (2022). [Code]
Partitura: A Python Package for Symbolic Music Processing.
In Proceedings of the Music Encoding Conference (MEC 2022), Halifax, Canada.

Foscarin, F., Karystinaios, E., Peter, S., Cancino-Chacón, C., Grachten, M. and Widmer, G. (2022). [Doc | Data]
The match File Format: Encoding Alignments between Scores and Performances.
In Proceedings of the Music Encoding Conference (MEC 2022), Halifax, Canada.

Acknowledgment

This project receives funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme under grant agreement No 101019375 (Whither Music?).