# How Does Shazam Work? Let's Understand Music Recognition Algorithms Together

in technology •  3 years ago

In my last article I described the Fourier transform as a way of representing signals in the frequency domain. I also promised to explain how it is applied in Shazam, a wonderful service that identifies a song from a short musical excerpt. The app is available for iPhone, Android and other platforms.

Imagine that you are at a concert and hear a lovely song that you don't know but want to remember. Turn Shazam on, and it will send you the song title and artist, along with additional information: lyrics, videos, a biography of the artist, concert tickets and recommended tracks. In this article I won't give any complex mathematical formulas, but I will try to explain music recognition algorithms in simple language.

# How is the Fourier transform connected with Shazam's algorithms?

The discrete Fourier transform, which I described in a previous article, transforms a finite set of signal samples taken at regular intervals of time into a list of coefficients of a finite combination of complex sinusoids, ordered by frequency. It lets us study the spectrum of the signal and determine which frequencies are present in it and which are not. After that, you can filter, amplify or attenuate certain frequencies, recognize a sound of a certain pitch among the available frequencies, or extract a signature of the signal: its "fingerprint", to put it in simple language.
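To make this concrete, here is a minimal sketch in Python with NumPy. It builds a toy signal from two sine waves and uses the discrete Fourier transform to recover which frequencies it contains. The function name `dominant_frequencies` and the test signal are my own illustration, not anything taken from Shazam.

```python
import numpy as np

def dominant_frequencies(samples, sample_rate, count=2):
    """Return the `count` strongest frequencies (in Hz) of a real-valued signal."""
    spectrum = np.fft.rfft(samples)                      # DFT of a real signal
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    magnitudes = np.abs(spectrum)                        # strength of each frequency
    strongest = np.argsort(magnitudes)[-count:]          # indices of the largest values
    return sorted(float(f) for f in freqs[strongest])

sample_rate = 8000
t = np.arange(sample_rate) / sample_rate                 # one second of sample times
# Toy signal (an assumption for the demo): a 440 Hz tone plus a quieter 1000 Hz tone.
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

found = dominant_frequencies(signal, sample_rate)
print(found)  # approximately 440 Hz and 1000 Hz
```

All other frequencies have magnitudes close to zero, which is exactly the "which frequencies exist and which do not" question from the paragraph above.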

# Now let's move on to the technical part of how Shazam works

The common steps are:

• A card index of music imprints is created and saved in Shazam's database.
• The user "tags" a song that he hears, and an imprint is generated from a ten-second audio sample.
• The application sends the imprint to the Shazam service, which looks for matches in the database.
• If a match is found, you are notified and all the information about the track is displayed.

Here is how the imprinting works:

Shazam treats music as a simple graph called a spectrogram. Time is on one axis (the x-axis), frequency is on another (the y-axis), and the third, vertical axis shows intensity.
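A spectrogram like this can be sketched by cutting the signal into short overlapping windows and taking the Fourier transform of each one. The window size, hop size and Hann window below are assumptions chosen for the demo, not Shazam's actual parameters.

```python
import numpy as np

def spectrogram(samples, window_size=1024, hop=512):
    """Return magnitudes as a 2-D array of shape (frequency_bins, time_frames)."""
    window = np.hanning(window_size)                 # smooths the edges of each chunk
    frames = []
    for start in range(0, len(samples) - window_size + 1, hop):
        chunk = samples[start:start + window_size] * window
        frames.append(np.abs(np.fft.rfft(chunk)))    # intensity per frequency bin
    return np.array(frames).T                        # rows: frequency, columns: time

sample_rate = 8000
t = np.arange(2 * sample_rate) / sample_rate
# Toy signal: a 500 Hz tone during the first second, then a 1500 Hz tone.
tone = np.where(t < 1.0, np.sin(2 * np.pi * 500 * t), np.sin(2 * np.pi * 1500 * t))

spec = spectrogram(tone)
bin_hz = sample_rate / 1024                          # width of one frequency bin in Hz
print(spec.shape)                                    # (513, 30)
print(spec[:, 0].argmax() * bin_hz)                  # 500.0, loudest frequency at the start
print(spec[:, -1].argmax() * bin_hz)                 # 1500.0, loudest frequency at the end
```

Each column of `spec` is one moment in time, each row is one frequency, and the stored value is the intensity, which matches the three axes described above.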

Here is an example of how the song might look:

The Shazam algorithm makes an imprint of a song by building this three-dimensional graph and detecting the frequencies of "peak intensity".
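One simple way to pick out such peaks is to keep, for every time frame, only the strongest frequency bin in each of a few frequency bands. This is a simplified sketch under my own assumptions: the band edges and the threshold are hypothetical, and Shazam's real peak picking may differ.

```python
import numpy as np

def find_peaks(spec, bands=((0, 10), (10, 20), (20, 40), (40, 80)), min_magnitude=1.0):
    """Return (time_frame, frequency_bin) pairs: the strongest bin per band and frame."""
    peaks = []
    for frame in range(spec.shape[1]):
        column = spec[:, frame]                       # all frequencies at one moment
        for low, high in bands:
            bin_index = low + int(np.argmax(column[low:high]))
            if column[bin_index] >= min_magnitude:    # ignore bands that are too quiet
                peaks.append((frame, bin_index))
    return peaks

# Hypothetical toy spectrogram: 80 frequency bins x 3 time frames.
spec = np.zeros((80, 3))
spec[5, 0] = 9.0
spec[30, 0] = 4.0
spec[15, 1] = 7.0
spec[60, 2] = 0.5                                     # below the threshold, ignored

peaks = find_peaks(spec)
print(peaks)  # [(0, 5), (0, 30), (1, 15)]
```

Only the positions of the peaks are kept. The exact intensity values are thrown away, which helps the imprint survive noise and volume changes.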
Shazam builds its catalog of imprints as a hash table in which the frequency values serve as the keys. On receiving an imprint, Shazam uses its keys to look up similar songs. The hash table might look like this:

They look for pairs of points: a "peak intensity" plus a second "reference point". Their key therefore contains not just a single frequency but the frequencies of both points. This leads to fewer collisions (cases where two different pairs of points produce the same hash key) and speeds up the search through the catalog, because hash table lookups then stay close to their fast average-case running time.
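The pairing idea can be sketched with an ordinary Python dictionary. Each peak is paired with a few later peaks, and the key is built from both frequencies. Including the time gap `dt` in the key, as well as the `fan_out` and `max_dt` limits, are my assumptions for this illustration, and all names here are hypothetical.

```python
from collections import defaultdict

def pair_hashes(peaks, fan_out=3, max_dt=50):
    """Yield ((f1, f2, dt), t1) for each peak paired with up to `fan_out` later peaks."""
    peaks = sorted(peaks)                             # order by time, then frequency
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt <= 0:
                continue                              # skip peaks from the same frame
            if dt > max_dt or paired == fan_out:
                break
            yield (f1, f2, dt), t1                    # key plus the time of the first point
            paired += 1

def add_song(index, song_id, peaks):
    """Store every hash of a song together with the time at which it occurs."""
    for key, t1 in pair_hashes(peaks):
        index[key].append((song_id, t1))

index = defaultdict(list)
add_song(index, "song_a", [(0, 5), (1, 15), (3, 30)])  # (time_frame, frequency_bin)
print(dict(index))
# {(5, 15, 1): [('song_a', 0)], (5, 30, 3): [('song_a', 0)], (15, 30, 2): [('song_a', 1)]}
```

A key made of two frequencies is far more specific than a single frequency, so far fewer unrelated songs end up under the same key.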

The top graph: the scatterplot of matching hash locations shows no diagonal, so the songs are not the same.

The bottom graph: the matching frequencies occur at consistent times and form a diagonal, so the songs are identical.

If there is more than one match between the songs, the time-frequency correspondence is checked. A two-dimensional plot of the frequencies on which the matches occurred is built: one axis shows the time at which the frequency appears in the track, the other shows the corresponding time in the sample. If the points are correlated, they form a diagonal. If such a line is found, it is the song you were searching for, and its name is displayed to you.
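Points lying on a diagonal simply means that the difference between the time in the track and the time in the sample is the same for many matches. Here is a minimal sketch of that check, assuming a hypothetical catalog that maps each hash key to a list of (song, time) entries. The data and names are invented for the example.

```python
from collections import Counter

def best_match(index, sample_hashes):
    """Return (song_id, votes) for the most consistent time offset, or None."""
    votes = Counter()
    for key, sample_time in sample_hashes:
        for song_id, song_time in index.get(key, []):
            # Matches from the same song at the same offset lie on one diagonal.
            votes[(song_id, song_time - sample_time)] += 1
    if not votes:
        return None
    (song_id, _offset), count = votes.most_common(1)[0]
    return song_id, count

# Hypothetical catalog: hash key -> list of (song_id, time in the track).
index = {
    (5, 15, 1): [("song_a", 100), ("song_b", 7)],
    (15, 30, 2): [("song_a", 101)],
    (30, 12, 4): [("song_a", 103), ("song_b", 60)],
}
# Hashes taken from the recorded sample: (key, time in the sample).
sample = [((5, 15, 1), 0), ((15, 30, 2), 1), ((30, 12, 4), 3)]

result = best_match(index, sample)
print(result)  # ('song_a', 3)
```

All three matches for `song_a` share the offset 100, so they line up on a diagonal. The two matches for `song_b` have different offsets (7 and 57) and look like scattered points.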

So you see that the basic idea behind Shazam is not really hard to understand, even though the full scheme is rather complicated. Keep in mind that this is only the basic algorithm: in practice Shazam uses an upgraded version, and we will probably never know its details for sure, since developers keep such things secret.

Image credit: 1, 2, 3, 4

Alex aka @phenom
