Towards Expressivity-aware Computer Systems in Music

Project Summary

What makes music so important, what can make a performance so special and stirring? It is the things the music expresses, the emotions it induces, the associations it evokes, the drama and characters it portrays. The sources of this expressivity are manifold: the music itself, its structure, orchestration, personal associations, social settings, but also -- and very importantly -- the act of performance, the interpretation and expressive intentions made explicit by the musicians through nuances in timing, dynamics, etc. 

Thanks to research in fields like Music Information Retrieval (MIR), computers can do many useful things with music, from beat and rhythm detection to song identification and tracking. However, they are still far from grasping the essence of music: they cannot tell whether a performance expresses playfulness or ennui, solemnity or gaiety, determination or uncertainty; they cannot produce music with a desired expressive quality; they cannot interact with human musicians in a truly musical way, recognising and responding to the expressive intentions implied in their playing. 

The project is about developing machines that are aware of certain dimensions of expressivity, specifically in the domain of (classical) music, where expressivity is essential and -- at least as far as it relates to the act of performance -- can be traced back to well-defined and measurable parametric dimensions (such as timing, dynamics, articulation). We will develop systems that can recognise and characterise expressive qualities in music, search music by expressive aspects, and generate, modify, and react to expressivity in performance. To do so, we will (1) bring together the fields of AI, Machine Learning, Music Information Retrieval (MIR), and Music Performance Research; (2) integrate theories from Musicology to build more well-founded models of music understanding; (3) support model learning and validation with massive musical corpora of a size and quality unprecedented in computational music research.

In terms of computational methodologies, we will rely on, and improve, methods from Artificial Intelligence, in particular: probabilistic models (for information fusion, tracking, reasoning, and prediction); machine learning, in particular deep learning techniques (for learning musical features, abstractions, and representations from musical corpora, and for inducing mappings for expression recognition and prediction); audio signal processing and pattern recognition (for extracting musical parameters and patterns relevant to expressivity); and information theory (for modelling musical expectation, surprise, uncertainty, etc.). This will be combined with high-level concepts and models of structure perception from fields like systematic and cognitive musicology, in order to create systems that have a somewhat deeper 'understanding' of music, musical structure, music performance, and musical listening, and of the interplay of these factors in making music the expressive and rewarding art that it is. (A more detailed discussion of how we believe all these things relate to each other can be found in the "Con Espressione Manifesto".)
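To give a flavour of the information-theoretic ingredient mentioned above, here is a toy sketch of how musical "surprise" and "uncertainty" can be quantified. All probabilities below are invented for illustration; this is not a model from the project itself:

```python
import math

def surprisal(p: float) -> float:
    """Surprisal (self-information) in bits of an event with probability p."""
    return -math.log2(p)

# Hypothetical listener model: probability of the next chord after a
# dominant (V) chord. The expected resolution (I) is likely and therefore
# unsurprising; a deceptive resolution (e.g. bVI) is rare and surprising.
next_chord = {"I": 0.70, "vi": 0.20, "IV": 0.07, "bVI": 0.03}

for chord, p in next_chord.items():
    print(f"{chord}: {surprisal(p):.2f} bits")

# The expected surprisal is the entropy of the distribution:
# a measure of the listener's overall uncertainty at this point.
entropy = sum(p * surprisal(p) for p in next_chord.values())
print(f"uncertainty (entropy): {entropy:.2f} bits")
```

The same quantities, computed over learned probabilistic models of real music rather than a hand-written table, are what allow a system to predict where listeners will experience expectation, tension, and surprise.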

With this research, we hope to contribute to a new generation of MIR systems that can support musical services and interactions at a new level of quality, and to inspire expressivity-centered research in other domains of the arts and human-computer interaction (HCI).

Project Details

Call identifier


Project Number


Principal Investigator

Gerhard Widmer

Project Period

Jan 2016 - Dec 2021

Funding Amount

€ 2,318,750.00

The Con Espressione Game ...

Play and contribute: do you have a bit of time to listen to some bits of music (and maybe contribute some empirical data to the project)? This is what the project is about:

Play The Con Espressione Game

... or, if you have very little time:

The Con Espressione Manifesto

Our Guiding Strategic Document

Here is our view (2016) on the current research landscape, and what research needs to be done in the coming years (within and beyond Con Espressione):

(If you are interested in applying for a research position in the project, please think about how your research ideas and plans would fit in this scheme, or go beyond it, because we may have missed some crucial directions ...)





  • 2017-10-27:
    Demonstration of the first prototype of our expressive accompaniment system, the ACCompanion v0.1, at the Late Breaking / Demo Papers Session at the ISMIR 2017 Conference, Suzhou, China.
    Demo Videos on a Bösendorfer CEUS: Mozart Sonata K.545, 2nd mvt. (Werner Goebl); "The Wild Geese" (Gerhard Widmer)

  • 2017-10-14:
    Gerhard Widmer as studio guest in the main evening news of Slovenian public TV station RTV Slovenia (via RTV Slo Archive)
  • 2017-10-14:
    Gerhard Widmer to give keynote lecture at Slovenian Conference on Artificial Intelligence, Ljubljana
  • 2017-04-13:
    Gerhard Widmer to talk at the Karajan Music Tech Conference as part of the 50th Easter Festival (Osterfestspiele), Salzburg.

  • 2017-01-13:
    Our BasisMixer computational model of expressive music performance is reported to have passed a musical "Turing test": in a recent study by E. Schubert et al. ("Algorithms Can Mimic Human Piano Performance", Journal of New Music Research, Jan. 2017), it produced a piano performance whose "humanness", as judged by listeners, was indistinguishable from a human musician's, ranking best in this respect among a number of algorithms.



Project Team Members (Past and Present)

Anna Aljanaki

Andreas Arzt

Stefan Balke

Carlos Eduardo Cancino Chacon

Shreyan Chowdhury

Amaury Durand

Martin Gasser

Harald Frostel

Thassilo Gadermaier

Maarten Grachten

Florian Henkel

Rainer Kelz

Filip Korzeniowski

Stefan Lattner

Nastaran Okati

Silvan Peter

David Sears

Federico Simonetta

Andreu Vall

Gerhard Widmer

Other Collaborators / Contributors

Matthias Dorfer

Ali Nikrang

Publications, Presentations, and Media Coverage

Scientific Publications

Want to know more about the scientific work and results of the project?

Here's an up-to-date list of our scientific publications related to the project.

Public Presentations

Media Coverage

Demonstrators and Prototypes

Man-Machine Collaboration in Expressive Performance:
The "Con Espressione!" Exhibit

The Con Espressione! Exhibit is an interactive system designed for popular science exhibitions. It demonstrates and enables joint human-computer control of expressive performance: the visitor controls overall tempo and loudness of classical piano pieces (such as Beethoven's "Moonlight" sonata Op.27 No.2) via hand movements (tracked by a LeapMotion sensor). In the background, our "Basis Mixer" expressive performance model adds subtle modifications to the performance, such as articulation and micro-timing (e.g., slight temporal differences in the note onsets when playing chords). The contribution of the Basis Mixer can be controlled and experimented with via a slider.

The exhibit was first shown at the Science Exhibition ("The Mathematics of Music") in Heidelberg, Germany (May - Dec. 2019). The source code is openly available via a GitHub repository.

A video showcasing the Bösendorfer CEUS computer-monitored grand piano will come soon. Here is a first preview video (June 2019) ...

Autonomous Expressive Accompaniment:
The "ACCompanion"

The ACCompanion (work in progress) is an automatic accompaniment system designed to accompany a pianist in music for two pianists (or two pianos). The ACCompanion will not just follow the soloist in real time and synchronise with her playing, but will also recognise and anticipate the soloist's expressive intentions and playing style, and contribute its own expressive interpretation of the accompaniment part (via the "BasisMixer" expressive performance model). 

2017: Demonstration of the first, very preliminary prototype, ACCompanion v0.1 (currently limited to a monophonic solo part), at the ISMIR 2017 Conference, Suzhou, China.
Demo Videos on a Bösendorfer CEUS concert grand: Mozart Sonata K.545, 2nd mvt. (pianist: Werner Goebl); "The Wild Geese" (piano: Gerhard Widmer)

2019: to come: demo video with full polyphonic music.

A Generative Model of Expressive Piano Performance:
The "Basis Mixer"

The Basis Mixer is a comprehensive computational model of expressive music performance that predicts musically appropriate patterns for various performance parameters (tempo, timing, dynamics, articulation, ...) as a function of the score of a given piece. It is based on so-called basis functions (feature functions that describe various relevant aspects of the score) and state-of-the-art deep learning methods. A comprehensive description can be found in Carlos Cancino's Ph.D. thesis (Dec. 2018). The model has been used as an experimental tool for studying or verifying various hypotheses related to expressive piano performance, and can also be used to generate expressive performances for new pieces. An early version of the model is reported to have passed a musical "Turing test", producing a piano performance whose "humanness", as judged by listeners, was indistinguishable from a human musician's (and ranking best, in this respect, among a number of algorithms) in a recent study (E. Schubert et al., "Algorithms Can Mimic Human Piano Performance: The Deep Blues of Music", Journal of New Music Research, 2017).
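The core idea can be illustrated in a few lines. The sketch below uses a single linear layer with random stand-in weights in place of the trained deep networks, and the basis functions named in the comments are invented examples, not the model's actual feature set:

```python
import numpy as np

rng = np.random.default_rng(0)

n_notes, n_basis = 8, 4

# Each score note is described by a vector of basis-function values,
# e.g. (hypothetically) pitch height, metrical strength, a crescendo
# flag, and distance to the next phrase boundary.
basis = rng.random((n_notes, n_basis))

# A learned model maps these features to expressive parameters; here,
# random stand-in weights predicting two outputs per note: a loudness
# deviation and a local tempo deviation.
weights = rng.normal(scale=0.1, size=(n_basis, 2))

expression = basis @ weights  # shape: (n_notes, 2)
for i, (loud, tempo) in enumerate(expression):
    print(f"note {i}: loudness dev {loud:+.3f}, tempo dev {tempo:+.3f}")
```

In the real model, the mapping from basis functions to performance parameters is learned from recorded performances, so that the predicted deviations reflect how human pianists actually shape the music.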

The Basis Mixer is used as an autonomous expressive performance generator in several of our demonstrators (e.g., the ACCompanion and the Con Espressione! Exhibit).

Here is a web-based tool for experimenting with the model.
(Note [May 2019]: this is outdated; a new version with a better-trained model and a new interface will come soon ...)

"Quintuplets Over a Wonky Four on the Floor":
Generation of Expressive Rhythms in the Decoupled Oscillator Sequencer

The Decoupled Oscillator Sequencer is an experimental implementation of a tool for generating complex rhythms with expressive micro-timing. It is based on several individual, partly dependent clocks realised as oscillators that can influence each other's periodicities by virtue of being connected in a network.
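A minimal sketch of the coupling idea, assuming a Kuramoto-style phase update (all parameters invented; this is not the actual sequencer implementation): each clock advances at its own rate, plus a small term pulling it towards the phases of the clocks it is connected to, so the network drifts towards a common pulse without snapping to a rigid grid.

```python
import cmath
import math

def step(phases, freqs, coupling, dt=0.01):
    """Advance each oscillator by one Euler step; coupling pulls phases together."""
    n = len(phases)
    new = []
    for i in range(n):
        pull = sum(math.sin(phases[j] - phases[i]) for j in range(n) if j != i)
        new.append(phases[i] + dt * (2 * math.pi * freqs[i] + coupling * pull / n))
    return new

phases = [0.0, 1.0, 2.5]     # initial phases (radians)
freqs = [2.0, 2.05, 1.95]    # near, but not identical, tempi (cycles/s)
for _ in range(2000):        # simulate 20 seconds
    phases = step(phases, freqs, coupling=1.5)

# Kuramoto order parameter r: 1.0 would mean perfectly phase-locked clocks.
r = abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)
print(f"phase coherence after 20 s: r = {r:.3f}")
```

With positive coupling the clocks lock to a shared tempo while retaining small, systematic phase offsets; it is exactly such residual offsets that can be exploited as expressive micro-timing.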

Play with our interactive Web Prototype.

The technical background is briefly described in a demo paper (Sound & Music Computing (SMC) Conference 2019, Malaga, Spain).

Automatic Sound and Music Recognition:
The "Listening Machine" at the Ars Electronica Center

The Listening Machine is an interactive exhibit designed for the Ars Electronica Center (AEC) to demonstrate real-time computational sound/music perception to the general public. It is based on a deep neural network that has been trained, via machine learning methods and using thousands of sound examples, to recognise different kinds of sounds by learning which patterns in the sound signal are characteristic of certain classes: for example, what distinguishes a flute from a trumpet, or spoken language from singing.
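The final classification step of such a system can be sketched as follows. The class names, feature dimensions, and weights below are random stand-ins for illustration, not the trained Listening Machine: a network maps spectral features of an audio frame to a probability for each sound class.

```python
import numpy as np

rng = np.random.default_rng(1)

classes = ["flute", "trumpet", "speech", "singing"]
n_features = 16                    # e.g. log-mel band energies of one frame

features = rng.random(n_features)  # stand-in for an analysed audio frame
W = rng.normal(size=(n_features, len(classes)))  # stand-in learned weights

# Softmax turns the raw network outputs (logits) into class probabilities.
logits = features @ W
probs = np.exp(logits - logits.max())
probs /= probs.sum()

print(dict(zip(classes, probs.round(3))))
print("predicted class:", classes[int(probs.argmax())])
```

In the deployed exhibit this step runs continuously on live microphone input, with a deep network (many such layers) in place of the single random layer above.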

To come: video documentary produced by AEC on the occasion of the opening of the new AEC permanent exhibition (May 2019).

Rhythm Recognition and Tempo Tracking for Automatic Accompaniment:
Our "robod" Robo-Drummer

The robod is our little robot drummer that continually listens to its surroundings through a microphone, recognises when music is played, automatically determines the meter and downbeat, and accompanies the musicians in real time, adapting to expressive changes of tempo.
It was built for the BE OPEN Public Science Festival in the city center of Vienna (Sept. 2018), organised by the Austrian Science Fund (FWF) to mark its 50th anniversary.
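A toy sketch of the tempo-tracking step (invented numbers; not the robod's actual algorithm): given onset times detected in the audio, the median inter-onset interval yields a beat-period estimate that is robust to individual expressive timing deviations and can be re-estimated continually as the musicians speed up or slow down.

```python
def estimate_bpm(onsets):
    """Estimate tempo in BPM from a list of onset times (in seconds)."""
    intervals = sorted(b - a for a, b in zip(onsets, onsets[1:]))
    median = intervals[len(intervals) // 2]  # robust against outlier intervals
    return 60.0 / median

# Hypothetical beats at roughly 120 BPM (0.5 s apart), with slight
# expressive timing deviations around each nominal beat position:
onsets = [0.00, 0.51, 0.99, 1.50, 2.02, 2.49]
print(f"estimated tempo: {estimate_bpm(onsets):.1f} BPM")
```

Running this over a sliding window of recent onsets gives a tempo curve the drummer can follow in real time.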

To come (Aug. 2019): Demonstration video.



This project receives funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 670035).

In addition, we gratefully acknowledge material support for this research (in the form of direct financial support, music, scores, access to musical instruments and performance spaces) from the following institutions:

The Bösendorfer Piano Company, Vienna, Austria

The Royal Concertgebouw Orchestra (RCO), Amsterdam, The Netherlands