This post was written for my client Spectralmind and appeared initially on their blog:
When we demo Spectralmind’s SEARCH by Sound, a similarity search engine for music, we often notice how much listeners differ in the aspects of “similarity” they focus on. The results calculated by the Spectralmind platform appear “similar” to one listener, but are judged as “not similar” by another or “somewhat similar” by a third.
Musical similarity is a very complex area, and the deviations in judgement stem from the fact that similarity has so many dimensions. This raises the question: which dimension do people relate to when asked about the similarity of music?
Personally, I observe that people tend to judge similarity first of all by melody. The particular succession of higher and lower tones that forms a melody is clearly a distinctive feature, which allows the listener to determine the degree of likeness or even closeness between two musical works.
Trombone Shorty at the Jazzfest Wien, 2011
But there are other dimensions of similarity as well:
- Timbral similarity: timbre refers to the tone color of a sound, which varies significantly with the characteristics of the sound-creating device, such as voice, string or wind instruments. As listeners we are able to identify the kinds of instruments playing, even in an ensemble like a band or an orchestra. The same melody played on a piano, a saxophone or a guitar makes a big difference in terms of timbral similarity.
- Rhythmic similarity: rhythm is made up of a repeating pattern of sounds and silences. We perceive rhythm as fast or slow. Through rhythmic beats alone, we can tell musical genres apart, like rock from reggae. Music, dance and even spoken language rely on rhythm as a main and defining element. Different rhythms can be put underneath the same melody (which can be highly entertaining or massively disturbing). This practical example of melodic similarity combined with rhythmic dissimilarity highlights the difficulty of assessing an overall measure of similarity between two pieces of music.
- Structural similarity: this refers to the occurrence of specific sections within a piece of music. Common sections are intro, verse, chorus (also known as refrain), interlude and outro, among many more. These are formal criteria, which can be applied to describe constructive or sequential similarities of, for example, pop songs or symphonic compositions.
There are many more dimensions of similarity beyond the ones mentioned. Some of them are even inaccessible to human perception, but very perceptible to musical data-mining programs such as the Spectralmind Audio Intelligence Platform.
Similarity decisions need to be judged by the rationale of the similarity search. Sometimes, melodic resemblance is the searched-for attribute. In other cases it might be rhythmic conformity or timbral affinity, or a mix of multiple qualities. The crucial factor is the intended use of the similar-sounding music. Keeping this intention in mind helps to avoid a possible bias.
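To make the idea of a mix of multiple qualities more concrete, here is a minimal sketch in Python. It is purely illustrative and is not how the Spectralmind platform works: the function name `combined_similarity`, the dimension names, the scores and the weights are all made up for this example, and a weighted average is only one of many possible ways to combine dimensions.

```python
def combined_similarity(scores, weights):
    """Weighted average of per-dimension similarity scores.

    Both arguments are dicts keyed by dimension name; scores are
    assumed to lie between 0 (not similar) and 1 (identical).
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return sum(weights[dim] * scores[dim] for dim in weights) / total


# Hypothetical scores for one pair of tracks:
# the same melody played over a very different rhythm.
scores = {"melody": 0.9, "timbre": 0.5, "rhythm": 0.2}

# Hypothetical weightings that express the intent of the search.
melody_search = {"melody": 1.0, "timbre": 0.0, "rhythm": 0.0}
rhythm_search = {"melody": 0.0, "timbre": 0.0, "rhythm": 1.0}
mixed_search = {"melody": 1.0, "timbre": 1.0, "rhythm": 1.0}

for name, weights in [("melody", melody_search),
                      ("rhythm", rhythm_search),
                      ("mixed", mixed_search)]:
    print(name, round(combined_similarity(scores, weights), 2))
```

The same pair of tracks comes out as very similar, barely similar or somewhere in between, depending only on which weights the search intent assigns to each dimension.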
We are striving to improve our software in a way that makes its similarity opinion more comprehensible and transparent. Users have a desire to understand which dimensions of similarity the software uses to suggest something as similar.