Demixing Process

Separating the individual components of an existing mixture of sounds is difficult. Here, we propose an iterative approach that exploits a note event-based multiple fundamental frequency detection framework and spectral filters.


Step 1 - Input Signal

Imagine that we have a mixture of sounds: a recording that contains several musical instruments playing at the same time. Which instruments are playing, and how many of them? Initially we do not know these details, nor do we know the pitches of the notes being played, but the idea is to recover most of this information directly from the mixture.

The figure on the right shows the input signal in the time domain and its spectrogram, generated using a frame size of 2048 samples with 50% overlap. What does this mixture sound like? Find out by using the controls below. Can you tell how many instruments are playing?
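
As a rough illustration of the framing described above, the following sketch computes a magnitude spectrogram with 2048-sample frames and 50% overlap (a hop of 1024 samples). The Hann window, the function names and the pure-Python radix-2 FFT are assumptions made for this example; the text does not say which window or FFT implementation was used for the figure.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def spectrogram(signal, frame_size=2048, overlap=0.5):
    """Return one list of magnitudes (frame_size // 2 + 1 bins) per frame."""
    hop = int(frame_size * (1 - overlap))  # 1024 samples for 50% overlap
    # Hann window (an assumption; the source does not name the window).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_size)
              for i in range(frame_size)]
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_size)]
        spectrum = fft(frame)
        # Keep only the non-negative frequencies of the real-valued input.
        frames.append([abs(c) for c in spectrum[: frame_size // 2 + 1]])
    return frames
```

Each row of the result corresponds to one column of the spectrogram image. Bin k represents the frequency k * sample_rate / 2048, where the sample rate depends on the recording.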


Step 2 - Finding Note Events

Using the proposed iterative estimation/separation method, it is possible to look inside the audio mixture and find note events, each of which can be formed by a single musical note or by several musical notes with similar pitches. The output works like a score of the music (a piano roll): it shows which notes are being played at any particular time. The output for the piece of music we are analysing is shown on the right.

The coloured lines on the graph are known as pitch contours. They tell us that 11 note events were found (each colour is one note event) and, looking carefully at their shapes, we can say that the original mixture might consist of 17 musical notes. This is information we didn't have before!
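
To illustrate how one note event can contain more than one musical note, the sketch below splits a pitch contour (one frequency in Hz per analysis frame) into runs that round to the same semitone. This is only a simple way to mimic counting notes from the shape of a contour; it is not the method used to obtain the figures quoted above, and the function names and the min_frames threshold are hypothetical.

```python
import math


def hz_to_semitone(f_hz):
    """Nearest MIDI note number, using A4 = 440 Hz = note 69."""
    return round(69 + 12 * math.log2(f_hz / 440.0))


def split_into_notes(contour_hz, min_frames=3):
    """Split one contour into (semitone, start_frame, end_frame) runs.

    end_frame is exclusive. Runs shorter than min_frames are discarded
    as likely transitions (a hypothetical threshold).
    """
    notes = []
    start = 0
    for i in range(1, len(contour_hz) + 1):
        run_ended = (
            i == len(contour_hz)
            or hz_to_semitone(contour_hz[i]) != hz_to_semitone(contour_hz[start])
        )
        if run_ended:
            if i - start >= min_frames:
                notes.append((hz_to_semitone(contour_hz[start]), start, i))
            start = i
    return notes
```

For example, a contour that stays near 440 Hz for five frames and then near 466 Hz for four frames would be reported as two notes within a single note event.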


Step 3 - Separation of Note Events

Knowing the pitch contours, we can design a set of spectral filters to separate the energy of each note event from within the original mixture. The result is a new set of tracks, each containing the musical notes associated with one particular note event.
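
One simple form such a spectral filter could take is a binary harmonic mask: for each frame, keep the spectrogram bins close to the harmonics of the pitch given by the contour and zero the rest. The sketch below is an assumption made for illustration, since the text does not specify the filter design; the sample rate of 44100 Hz, the number of harmonics and the mask width are all hypothetical parameters.

```python
def harmonic_mask(f0_hz, sample_rate=44100, frame_size=2048,
                  n_harmonics=10, half_width_bins=2):
    """Binary mask over FFT bins that passes the harmonics of f0_hz."""
    n_bins = frame_size // 2 + 1
    bin_hz = sample_rate / frame_size  # frequency spacing between bins
    mask = [0.0] * n_bins
    for h in range(1, n_harmonics + 1):
        centre = round(h * f0_hz / bin_hz)
        if centre >= n_bins:
            break  # harmonic lies above the Nyquist frequency
        lo = max(0, centre - half_width_bins)
        hi = min(n_bins - 1, centre + half_width_bins)
        for k in range(lo, hi + 1):
            mask[k] = 1.0
    return mask


def apply_mask(magnitudes, mask):
    """Filter one spectrogram frame by element-wise multiplication."""
    return [m * w for m, w in zip(magnitudes, mask)]
```

Applying one mask per frame, following the contour of a note event, and then inverting the filtered spectrogram would give the separated track for that event. A binary mask is the crudest choice; softer masks share energy between overlapping harmonics of different notes.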


Step 4 - Clustering Note Events

Listening to the separated note events, it is possible to identify three instruments playing (violin, clarinet and saxophone). We can now cluster the note events belonging to the same instrument to form the tracks for the separated sources.


Step 5 - MIDI File

Now that we know the pitches and timings of the notes inside the mixture, many interesting things can be done. For instance, we can now convert the pitch contours into a MIDI file, so that the music can be played by a different set of instruments. The graph on the right shows an equivalent piano-roll representation of the mixture, and the example below presents a synthesised organ playing the same music.
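
The core of that conversion is mapping a frequency to a MIDI note number (A4 = 440 Hz is note 69, with 12 notes per octave) and mapping frame indices to seconds. The sketch below does this for a contour that holds a single note. The hop of 1024 samples follows from the 2048-sample frames with 50% overlap; the 44100 Hz sample rate, the use of the median pitch and the function names are assumptions for this example.

```python
import math
import statistics


def hz_to_midi(f_hz):
    """Fractional MIDI note number for a frequency in Hz."""
    return 69 + 12 * math.log2(f_hz / 440.0)


def contour_to_note(contour_hz, start_frame, hop=1024, sample_rate=44100):
    """Summarise a single-note contour as pitch, onset and duration.

    The median is used so that vibrato or brief tracking errors do not
    shift the note to a neighbouring semitone.
    """
    pitch = round(hz_to_midi(statistics.median(contour_hz)))
    onset = start_frame * hop / sample_rate
    duration = len(contour_hz) * hop / sample_rate
    return {"midi": pitch, "onset_s": onset, "duration_s": duration}
```

A list of such notes holds the same information as the piano-roll representation and is what a MIDI writer needs as input.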


Step 6 - Writing the Score

The MIDI file created in the previous step can be imported into music notation software to generate an approximation of the score of the original music excerpt. In this example we used MuseScore.