Experimenting with Violins
Detecting pitch contours in audio mixtures becomes increasingly difficult as the polyphony increases. Even human listeners have trouble distinguishing individual notes or recognising instruments in highly complex audio mixtures. Consider the following experiment: we recorded several violins, each playing a different note at the same time, and your task is to guess how many of them are present. Use the controls below to listen to the mixture. WARNING: It actually sounds awful!
Can you tell how many violins are playing?
The Classic Approach
One way to find out the number of violins is to run a joint multipitch analysis on the original mixture. In this example we used Duan's algorithm to generate a set of pitch estimates, which are presented in the graph on the right.
Looking at the estimates, we notice that the trajectories are not completely clear. It looks like five violins are playing continuous notes between 600 Hz and 1200 Hz, but if we also count the extra incomplete lines outside this range, there could be eight or even nine. The problem is that some of these estimates are outliers (octave and suboctave errors).
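To make the idea of octave and suboctave errors concrete, the following minimal sketch flags estimates that lie outside a plausible pitch range and sit roughly one octave above or below an in-range estimate. The function names, the 600 Hz to 1200 Hz range and the 50-cent tolerance are illustrative assumptions; this is not part of Duan's algorithm or of the system described here.

```python
import math

def cents(f1_hz, f2_hz):
    """Signed interval from f2_hz to f1_hz in cents (1200 cents = one octave)."""
    return 1200.0 * math.log2(f1_hz / f2_hz)

def flag_octave_errors(estimates_hz, lo_hz=600.0, hi_hz=1200.0, tol_cents=50.0):
    """Return out-of-range estimates lying about one octave from an in-range one.

    Hypothetical helper: the range and tolerance are assumptions for illustration.
    """
    in_range = [f for f in estimates_hz if lo_hz <= f <= hi_hz]
    flagged = []
    for f in estimates_hz:
        if lo_hz <= f <= hi_hz:
            continue  # in-range estimates are kept as candidate notes
        for g in in_range:
            # An octave (up) or suboctave (down) error is ~1200 cents away.
            if abs(abs(cents(f, g)) - 1200.0) <= tol_cents:
                flagged.append(f)
                break
    return flagged

# Made-up estimates for one analysis frame (not data from the experiment).
frame = [650.0, 800.0, 1000.0, 325.0, 1600.0, 2000.0, 450.0]
outliers = flag_octave_errors(frame)
```

A check like this cannot tell a real note from an error when two instruments genuinely play an octave apart, which is one reason the raw estimates are hard to interpret.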

An Iterative Note Event-based Approach
If we now use the iterative approach described on the previous page, we can estimate a set of note events by detecting and extracting the predominant event in each iteration.
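The control flow of such an iterative scheme can be sketched with a toy stand-in. Here each "event" is simply a (pitch, energy) pair and "extraction" just removes the pair from a list; the real detection and extraction steps are described on the previous page and are not reproduced. The function name, its parameters and the stopping rule are assumptions made for illustration.

```python
def extract_events(components, max_iterations=10, min_energy=1e-6):
    """Toy iterative extraction over a list of (pitch_hz, energy) pairs.

    Hypothetical stand-in: a real system would detect the predominant note
    event in the audio and subtract it from the mixture signal.
    """
    residual = list(components)
    events = []
    for _ in range(max_iterations):
        if not residual:
            break  # nothing left to extract
        # "Detect" the predominant event: the one with the highest energy.
        predominant = max(residual, key=lambda c: c[1])
        if predominant[1] < min_energy:
            break  # the remainder is negligible
        events.append(predominant)
        # "Extract" it, leaving the rest for the next iteration.
        residual.remove(predominant)
    return events, residual

# Made-up components (not data from the experiment).
toy_mixture = [(700.0, 0.2), (880.0, 0.5), (1530.0, 0.3)]
events, residual = extract_events(toy_mixture, max_iterations=2)
```

With two iterations, the two strongest components are extracted in order of decreasing energy and the weakest one remains in the residual.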
After ten iterations, nine events were detected; their pitch trajectories are shown in the figure on the right. It is now clear that seven violins are playing simultaneously, with pitches from about 700 Hz up to 1530 Hz. These notes account for 99.76% of the energy of the original mixture, which makes it very unlikely that the system has missed any other note.
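One plausible way to obtain such an energy percentage is to compare the energy of the residual (the mixture after all detected events have been removed) with the energy of the original mixture. The sketch below assumes time-domain signals stored as plain lists of samples and assumes that the residual is what remains after subtraction; how the system described here actually measures energy is not stated, so the function names and the definition are assumptions.

```python
def energy(signal):
    """Sum of squared samples."""
    return sum(x * x for x in signal)

def explained_energy_fraction(mixture, residual):
    """Fraction of the mixture energy that is no longer present in the residual.

    Assumed definition: 1 - E(residual) / E(mixture).
    """
    total = energy(mixture)
    if total == 0:
        raise ValueError("mixture has zero energy")
    return 1.0 - energy(residual) / total

# Made-up signals (not data from the experiment).
mixture = [1.0, -1.0, 1.0, -1.0]
residual = [0.1, -0.1, 0.1, -0.1]
fraction = explained_energy_fraction(mixture, residual)
```

In this toy case the residual has 1% of the mixture energy, so the extracted part accounts for about 99% of it.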

Due to the complexity of the original mixture in the frequency domain, two of the violin notes are detected as multiple events (Events 2 and 8, and Events 3 and 9). However, their shapes and relative positions make it clear that each pair belongs to a single note. Comparing the trajectories in this example with the ground-truth ones, the reported accuracy (F-score) reached 98.6%.
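For reference, the F-score is the harmonic mean of precision and recall. The sketch below shows how it could be computed for a single analysis frame by matching estimated pitches to ground-truth pitches. The greedy matching and the 50-cent tolerance are assumptions; the exact evaluation protocol behind the figure above is not given here.

```python
import math

def match_frame(estimated_hz, reference_hz, tol_cents=50.0):
    """Greedily match estimates to reference pitches; return (tp, fp, fn).

    Hypothetical helper: each reference pitch can be matched at most once.
    """
    unmatched = list(reference_hz)
    tp = 0
    for f in estimated_hz:
        for g in unmatched:
            if abs(1200.0 * math.log2(f / g)) <= tol_cents:
                unmatched.remove(g)  # safe: we leave the inner loop right away
                tp += 1
                break
    fp = len(estimated_hz) - tp  # estimates with no matching reference
    fn = len(unmatched)          # reference pitches that were missed
    return tp, fp, fn

def f_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up pitches for one frame (not data from the experiment).
tp, fp, fn = match_frame([700.0, 880.0, 1000.0], [702.0, 880.0, 1530.0])
score = f_score(tp, fp, fn)
```

In practice the counts would be accumulated over all frames before computing the score.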