How Shazam Works

There is a cool service called Shazam, which takes a short sample of music and identifies the song.  There are a couple of ways to use it, but one of the more convenient is to install their free app on an iPhone.  Just hit the “tag now” button, hold the phone’s mic up to a speaker, and it will usually identify the song and provide artist information, as well as a link to purchase the album.

What is so remarkable about the service is that it works on very obscure songs and will do so even with extraneous background noise.  I’ve gotten it to work while sitting in a crowded coffee shop and in a pizzeria.

So I was curious how it worked, and luckily there is a paper written by one of the developers explaining just that.  Of course they leave out some of the details, but the basic idea is exactly what you would expect:  it relies on fingerprinting music based on the spectrogram.

Here are the basic steps:

1. Beforehand, Shazam fingerprints a comprehensive catalog of music, and stores the fingerprints in a database.
2. A user “tags” a song they hear, which fingerprints a 10-second sample of audio.
3. The Shazam app uploads the fingerprint to Shazam’s service, which runs a search for a matching fingerprint in their database.
4. If a match is found, the song info is returned to the user; otherwise an error is returned.

Here’s how the fingerprinting works:

You can think of any piece of music as a time-frequency graph called a spectrogram.  On one axis is time, on another is frequency, and on the 3rd is intensity.  Each point on the graph represents the intensity of a given frequency at a specific point in time. Assuming time is on the x-axis and frequency is on the y-axis, a horizontal line would represent a continuous pure tone and a vertical line would represent an instantaneous burst of white noise.  Here’s one example of how a song might look:

shazam-spectrogram

Spectrogram of a song sample with peak intensities marked in red. Wang, Avery Li-Chun. An Industrial-Strength Audio Search Algorithm. Shazam Entertainment, 2003. Fig. 1A,B.
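
To make the time-frequency graph concrete, here is a minimal sketch of how one could be computed. It is my own illustration, not Shazam's code: the function name `spectrogram`, the frame size, and the hop length are all assumptions, and it uses a slow textbook DFT from the standard library where real code would use an FFT library.

```python
import cmath
import math


def spectrogram(samples, sample_rate, frame_size=64, hop=32):
    """Return (times, freqs, magnitudes) for a mono signal.

    magnitudes[t][k] is the intensity of frequency freqs[k] at time times[t].
    A naive DFT is used for clarity; it is far too slow for real audio.
    """
    half = frame_size // 2
    # A Hann window reduces leakage between neighbouring frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    freqs = [k * sample_rate / frame_size for k in range(half)]
    times, magnitudes = [], []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        row = []
        for k in range(half):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            row.append(abs(acc))
        times.append(start / sample_rate)
        magnitudes.append(row)
    return times, freqs, magnitudes


# A pure 1000 Hz tone sampled at 8000 Hz: the "horizontal line" case.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(512)]
times, freqs, mags = spectrogram(tone, rate)
# The strongest frequency in every time slice.
loudest = [freqs[row.index(max(row))] for row in mags]
```

For the pure tone, the strongest bin is the same in every time slice, which is exactly the horizontal line described above.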

The Shazam algorithm fingerprints a song by generating this 3D graph and identifying frequencies of “peak intensity.”  For each of these peak points it keeps track of the frequency and the amount of time from the beginning of the track.  Based on the paper’s examples, I’m guessing they find about 3 of these points per second. [Update: A commenter below notes that in his own implementation he needed more like 30 points/sec.]  So an example of a fingerprint for a 10-second sample might be:

Frequency in Hz    Time in seconds
823.44             1.054
1892.31            1.321
712.84             1.703
. . .              . . .
819.71             9.943
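
The paper does not spell out how peaks are chosen, so the following is only a guess at one workable rule: keep a spectrogram cell if it is the single strongest value in a small neighbourhood around it. The function name `find_peaks` and the `radius` and `floor` parameters are hypothetical. Because the neighbourhood slides with each cell instead of using fixed tiles, the same peaks should be found no matter where in the song the sample starts.

```python
def find_peaks(times, freqs, magnitudes, radius=2, floor=1.0):
    """Return [(frequency_hz, time_s), ...] for local maxima of a spectrogram.

    magnitudes[t][k] is the intensity at times[t] and freqs[k].
    radius and floor are illustrative tuning knobs, not values from the paper.
    """
    peaks = []
    n_t = len(magnitudes)
    n_f = len(freqs)
    for t in range(n_t):
        for k in range(n_f):
            value = magnitudes[t][k]
            if value < floor:
                continue  # too quiet to survive background noise
            neighbourhood = [
                magnitudes[tt][kk]
                for tt in range(max(0, t - radius), min(n_t, t + radius + 1))
                for kk in range(max(0, k - radius), min(n_f, k + radius + 1))
            ]
            # Keep the cell only if it is the unique maximum nearby.
            if value >= max(neighbourhood) and neighbourhood.count(value) == 1:
                peaks.append((freqs[k], times[t]))
    return peaks


# A tiny made-up spectrogram: rows are time slices, columns are frequencies.
toy_times = [0.0, 0.1, 0.2, 0.3, 0.4]
toy_freqs = [100.0, 200.0, 300.0, 400.0]
toy_grid = [
    [0, 0, 0, 0],
    [0, 9, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 5],
    [0, 0, 0, 0],
]
toy_peaks = find_peaks(toy_times, toy_freqs, toy_grid, radius=1)
```

Here the 9 and the 5 are kept, while the two 1s are dropped because a stronger neighbour sits next to them.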

Shazam builds their fingerprint catalog out as a hash table, where the key is the frequency.  When Shazam receives a fingerprint like the one above, it uses the first key (in this case 823.44), and it searches for all matching songs.  Their hash table might look like the following:

Frequency in Hz    Time in seconds, song information
823.43             53.352, “Song A” by Artist 1
823.44             34.678, “Song B” by Artist 2
823.45             108.65, “Song C” by Artist 3
. . .              . . .
1892.31            34.945, “Song B” by Artist 2
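
A rough sketch of such a table, using the rows above, could look like the following. One assumption of mine that is not in the paper: a noisy recording will rarely reproduce a frequency to two decimal places, so the sketch rounds each frequency down to a 1 Hz bin before using it as the key. The names `freq_key`, `add_entry`, and `lookup` are made up for illustration.

```python
from collections import defaultdict

# key: quantised frequency -> list of (time_in_song_s, song) entries
catalog = defaultdict(list)


def freq_key(freq_hz, bin_hz=1.0):
    """Quantise a frequency so that nearly equal peaks share one key."""
    return int(freq_hz // bin_hz)


def add_entry(freq_hz, time_s, song):
    """Store one fingerprint point from the catalog."""
    catalog[freq_key(freq_hz)].append((time_s, song))


def lookup(freq_hz):
    """Return every (time_in_song_s, song) stored under this frequency."""
    return catalog.get(freq_key(freq_hz), [])


add_entry(823.43, 53.352, "Song A")
add_entry(823.44, 34.678, "Song B")
add_entry(823.45, 108.65, "Song C")
add_entry(1892.31, 34.945, "Song B")

candidates = lookup(823.44)
```

With 1 Hz bins, looking up 823.44 returns all three songs as candidates. A single frequency is clearly not selective enough on its own.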

[Some extra detail: They do not just mark a single point in the spectrogram; rather, they mark a pair of points: the "peak intensity" plus a second "anchor point".  So their key is not just a single frequency, it is a hash of the frequencies of both points.  This leads to fewer hash collisions, which in turn speeds up catalog searching by several orders of magnitude by allowing them to take greater advantage of the table's constant (O(1)) look-up time.  There are many interesting things to say about hashing, but I'm not going to go into them here, so just read up on hash tables and hash collisions if you're interested.]
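
Here is a hedged sketch of what pairing points might look like. As I read the paper, the key combines both frequencies with the time difference between the two points, and the absolute time of the first point is stored alongside it. The function name `pair_hashes`, the `fan_out` and `max_dt` limits, and the rounding choices are my assumptions. A plain tuple stands in for the packed integer hash.

```python
def pair_hashes(peaks, fan_out=3, max_dt=2.0):
    """Turn peaks [(freq_hz, time_s), ...] into [(key, first_time_s), ...].

    Each peak is paired with up to fan_out later peaks. The key is
    (freq1, freq2, time difference in tenths of a second), so it does not
    depend on where in the song the pair occurs.
    """
    ordered = sorted(peaks, key=lambda p: p[1])
    hashes = []
    for i, (f1, t1) in enumerate(ordered):
        for f2, t2 in ordered[i + 1:i + 1 + fan_out]:
            dt = t2 - t1
            if dt > max_dt:
                break  # later peaks are even further away in time
            key = (int(f1), int(f2), int(round(dt * 10)))
            hashes.append((key, t1))
    return hashes


sample_peaks = [(823.44, 1.054), (1892.31, 1.321), (712.84, 1.703)]
sample_hashes = pair_hashes(sample_peaks)

# The same three peaks occurring 30 seconds later produce the same keys.
shifted_peaks = [(f, t + 30.0) for f, t in sample_peaks]
shifted_hashes = pair_hashes(shifted_peaks)
```

The keys are identical for the original and the shifted peaks, and only the stored times differ. That is what lets a sample taken from the middle of a song still hit the right table entries.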

shazam-plot

Top graph: the song and sample have many frequency matches, but they do not align in time, so there is no match. Bottom graph: frequency matches occur at the same time, so the song and sample are a match. Wang, Avery Li-Chun. An Industrial-Strength Audio Search Algorithm. Shazam Entertainment, 2003. Fig. 2B.

If a specific song is hit multiple times (based on examples in the paper I think it needs about 1 frequency hit per second), it then checks to see if these frequencies correspond in time.  They actually have a clever way of doing this.  They create a 2D plot of frequency hits: on one axis is the time from the beginning of the track at which those frequencies appear in the song, and on the other axis is the time at which those frequencies appear in the sample.  If there is a temporal relation between the sets of points, then the points will align along a diagonal.  They use another signal processing method to find this line, and if it exists with some certainty, then they label the song a match.
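
One simple way to look for that diagonal is to notice that points on it all share the same difference between time-in-song and time-in-sample, so counting how often each difference occurs is enough. The sketch below does that with a histogram. The function name `best_match`, the 0.1 second bin width, the vote threshold, and the third Song B hit are all made up for illustration.

```python
from collections import Counter


def best_match(hits, bin_s=0.1, min_votes=3):
    """Pick the song whose hits agree on a single time offset.

    hits: [(song, time_in_song_s, time_in_sample_s), ...]
    Returns (song, offset_s, votes), or None if nothing lines up well enough.
    """
    votes = Counter()
    for song, song_t, sample_t in hits:
        # Hits on the same diagonal share (song time - sample time).
        offset_bin = round((song_t - sample_t) / bin_s)
        votes[(song, offset_bin)] += 1
    if not votes:
        return None
    (song, offset_bin), count = votes.most_common(1)[0]
    if count < min_votes:
        return None
    return song, offset_bin * bin_s, count


hits = [
    ("Song B", 34.678, 1.054),  # from the tables above
    ("Song B", 34.945, 1.321),  # from the tables above
    ("Song B", 35.327, 1.703),  # hypothetical third hit
    ("Song A", 53.352, 1.054),
    ("Song A", 10.000, 1.321),  # hypothetical stray hit
    ("Song C", 108.65, 1.054),
]
result = best_match(hits)
```

All three Song B hits agree on an offset of about 33.6 seconds, so Song B wins, while the Song A and Song C hits scatter across different offsets.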


83 Responses to “How Shazam Works”

  1. Riki Says:

    Very very well explained !

    Do you actually know of any code (Audacity, maybe?) to obtain a time-frequency graph from a common audio file (wav, mp3, or any other)?

  2. laplacian Says:

    Thanks for the kind words!

    Yes, Audacity does provide spectrograms. From the audacity 1.2 reference: “You can view any audio track as a Spectrogram instead of a Waveform by selecting one of the Spectral views from the Track Pop-Down Menu.”

    Now finding the code to do this within their source might be tricky, so I would recommend another project. The key to this code is in the Fourier Transform, which is usually done by the FFT algorithm ( http://en.wikipedia.org/wiki/Fast_Fourier_transform ).

    Whenever I’m looking for source code for something like this I usually try http://www.sourceforge.net. Just try a search for FFT and you should find some good stuff.

  3. Ralphy Says:

    Thanks for the nice description – sure makes the paper more readable…

    Do you happen to know how the “peak intensity” points are found?

    I understand they have a limit on the number of constellation points based on some requirements.

    However, I didn’t understand how to select the “right” candidates from all the local maximum points in the spectrogram (how do I choose a constellation point? how do I choose an anchor point?)

  4. laplacian Says:

    Ralphy, they do not go into detail about what exactly it means to be a peak point, but they give some clues. Here’s the relevant paragraph from section 2.1 of the paper:

    “The peaks are chosen using a criterion to ensure that the density of chosen local peaks is within certain desired bounds so that the time-frequency strip for the audio file has reasonably uniform coverage. The peaks in each time-frequency locality are also chosen according to amplitude, with the justification that the highest amplitude peaks are most likely to survive the distortions listed above.”

    So I’m guessing they might do something such as grabbing a small area of the spectrogram and picking the highest 1 or 2 amplitude points, then continuing to the next section. This heuristic would meet their requirement of a uniform distribution while finding local amplitude maxima.

    As far as the anchor points go, I got the impression that it really didn’t matter, as long as it would be a unique and reproducible pairing of points. I believe it was more for hashing performance benefits than for anything else.

  5. laplacian Says:

    Actually that idea won’t work because there is no way to correlate sections on the time dimension for different samples. Any other ideas?

  6. lf Says:

    How do the frequency, and peak points match at all, when the frequencies vary so greatly based on ambience, and sound quality? The low frequencies must definitely be lost.

  7. What I'm Reading Says:

    [...] OK, but how does Shazam make these fingerprints? As Avery Wang, Shazam’s chief scientist and one of its co-founders, explained to Scientific American in 2003, the company’s approach was long considered computationally impractical—there was thought to be too much information in a song to compile a simple signature. But as he wrestled with the problem, Wang had a brilliant idea: What if he ignored nearly everything in a song and focused instead on just a few relatively “intense” moments? Thus Shazam creates a spectrogram for each song in its database—a graph that plots three dimensions of music: frequency vs. amplitude vs. time. The algorithm then picks out just those points that represent the peaks of the graph—notes that contain “higher energy content” than all the other notes around it, as Wang explained in an academic paper he published to describe how Shazam works (PDF). In practice, this seems to work out to about three data points per second per song. [...]

  8. Dan Ellis Says:

    For those interested in more details, I made my own implementation based on the paper, using the Matlab signal processing environment:

    http://labrosa.ee.columbia.edu/~dpwe/resources/matlab/fingerprint/

    If you can read Matlab source, you can see how it works. I played around with it a fair bit to try to get noise robustness similar to the Shazam app. I ended up needing a lot more than 3 landmarks/sec – more like 30.

    • Bryan Jacobs Says:

      This is very cool, I’ll have to check out your code. The 3 points/second number was extrapolated from the graphs of the examples in their paper, but it might not be accurate for real-world use cases.

  9. How Shazam Works « Finding Delta Says:

    [...] now I know, thanks to an article in Slate that linked to a explanation of the original paper describing the method. The key is in compressing the database of 2M tunes [...]

  10. Aby Says:

    Too good article…very well explained. Thanks for sharing the information !!

  11. Carlo Hamalainen Says:

    It’s pretty easy to get a spectrogram of a wav file in Python using matplotlib:

    http://www.yhvh.co.uk/blog/spectrogram-python

  12. Shazam’s magic revealed! « Deepak Lalan Says:

    [...] http://laplacian.wordpress.com/2009/01/10/how-shazam-works/ [...]

  13. aubio Says:

    Playing with Shazam fingerprints…

    [...] A few weeks ago, I took some time to read the paper and implement [...]…

  14. Playing with Shazam fingerprints :: aubio Says:

    [...] http://aubio.org/news/20091111-2339_shazam [...]

  15. iPhone App Listens to Music and tells you the name of Song « EVOL.reverse Says:

    [...] http://laplacian.wordpress.com/2009/01/10/how-shazam-works/ [...]

  16. Mike Says:

    This is so much easier to understand than Wang’s paper :)

    Thanks.

    I’m still amazed by what Shazam is able to do.

    The question that’s still lurking in my head is – how does the program know how far into a song it is? I’d think that people usually start Shazaming a song after it has started playing (oftentimes, a few minutes into the song). How can the program match all the peak intensities so fast? And they must need a huge server/powerful computers to process all these inquiries going in eh?

    • Bryan Jacobs Says:

      The program knows how far it is into the song because the time for each point is stored in the hash table. It looks up that time in O(1), which is why it is so fast. It does not do a linear search, which you seem to be implying; that would be O(n) and much slower. Definitely read more about hash tables and you will understand why they are so powerful. Yes, they probably have large servers to handle the bandwidth and potentially tens of thousands of concurrent matching requests. The good thing is that the matching requests are read-only, so that makes the problem much easier.

      • Mike Says:

        Thanks Bryan :)

        one more question – how do they deal with the “legal” issues? Did they actually buy all the songs stored in their massive database? (just out of curiosity)

      • Kadir Says:

        Shouldn’t be a problem, because they do not store the music itself, only the data to recognize it.

  17. Alphonse Hà Says:

    Very cool… but for the ones who are not very technical…

    How does Shazaam work in plain english?

    :(

  18. Joe M Says:

    may i suggest an interesting reference:

    http://www.birds.cornell.edu/brp/pdf-documents/AppB-SpectrumAnalysis.pdf

    they also have some programs on their website here:

    http://www.birds.cornell.edu/brp/software

    the one called raven gives the voice prints shown in the pdf above and these are the 3rd type of chart mentioned here early on in the description.

    have fun, I’m not an expert at all but I have years of electronics experience, some of it related, and have worked with a bird person/scientist who studies and publishes.

    very interesting field and there is certainly a relationship here with shazam.

  19. Shazam Magic « Hillel’s Weblog Says:

    [...] Quick google search reviled the academic paper written by someone from Shazam and a good explanation by this blogger. Very impressive [...]

  20. Dave Says:

    I think it’s all magic.

  21. david Says:

    My friend and I tried Shazam on the same song with an iPhone and an Android phone. Both phone apps recognized the sound and returned the same song info, but the jacket images they received from Shazam were different. How come?

    • Steve Says:

      You don’t say what the song is but there are a number of reasons. The first one is that the Shazam App will return the first album that the track is available on in the iTunes store. The Android app doesn’t have access to the iTMS so you might get a different album. Second, there are a small number of artists who have different album covers on the iTMS to what you buy in the shop. And third, it is possible that Shazam’s info is wrong. It’s very rare but it does happen. It’s usually typos in track names.

  22. Come funziona Shazam (e cloni vari come Midomi): spiegazione tecnica | A thousand words Says:

    [...] stop here. If you are interested, I refer you to this article in English (from which I took all the information above) for further [...]

  23. Creating ‘Shazam’ in Java - Redcode Says:

    [...] A couple of days ago I encountered this article: How Shazam Works [...]

  24. Roy van Rijn Says:

    I’ve implemented this algorithm (in a simple form) in Java and documented it a bit on my website:

    http://www.redcode.nl/blog/2010/06/creating-shazam-in-java/

    I’m pretty pleased with the results, and the way my matching handles offsets in the song.

    Because I save the time-in-song for each hashpoint, and the time-in-fragment for the recorded fragment, I can calculate the offset of the hash-match in the song. If I have multiple matches with the same offset I’ve found the correct song.

    The matching is as easy as, for all hashpoints:
    – Find hash matches
    – Subtract time-in-song from time-in-fragment = offset
    – Count, add one to song/offset combination

    The song with the highest song/offset combination is our prediction!

    • Bryan Jacobs Says:

      Wow this looks very cool. I like your clean code in the samples too. Would be a great open source project!

  25. Alex Says:

    I worked for a little company called ConneXus in 2001 that is now mediaguide (http://mediaguide.com/about) where we built eerily similar technology ‘fingerprinting’ music and matching it against playback, though in this case it was Radio Broadcast. We were told the problem was virtually impossible to solve, but we were young and cavalier and didn’t know any better, so we solved it anyhow, and that was back in the days of sub-1GHz CPUs. The algorithm tolerated an amount of fuzzy matching due to audio speed-up, which is pretty common at some radio stations (find a radio station that plays pretty much anything from Madonna’s Ray of Light anywhere near the original speed), so it could match songs that were not exact reproductions of the original in time or in pitch. This presented some challenges because we would occasionally match remixes as the original, and it made the boss really mad :)

  26. One more example of reactive social role of patent system « Eikonal Blog Says:

    [...] Shazam Works” By Bryan Jacobs – http://laplacian.wordpress.com/2009/01/10/how-shazam-works/ Leave a [...]

  27. Help Fight Patent Bullying From Shazam — Spread This Code! | Techrights Says:

    [...] A couple of days ago I encountered this article: How Shazam Works [...]

  28. Skybuck Says:

    I wouldn’t worry about it too much.

    According to Dutch “octrooi” law you should be able to examine and study any patent and even reproduce it as long as it is used for study.

    You could probably even release it under open source, just like many other open source projects might involve patents.

    A warning that the code might violate certain patents would be nice and probably sufficient.

  29. Lecture 6 (9/3): Trigonometric Integrals | Honors Calculus II Says:

    [...] motivated integrals of the form with a brief story about Fourier series and the mobile phone app Shazam. This entry was posted in Lectures and tagged fourier, integral, shazam, trigonometric. Bookmark [...]

  30. Mago Fabian Says:

    Hi Roy,

    maybe there is some useful additional information here:

    http://www.ee.columbia.edu/~dpwe/e4896/lectures/E4896-L13.pdf

    I’m trying to write something similar on my own, but I’m still far away from getting things done.

    is there a possibility to receive a copy of your amazing code?

    thanks and keep on doing your excellent work!

  31. Japan – News And Articles » Blog Archive » JavaでShazamを作成 – Redcodeを Says:

    [...] A couple of days ago I encountered this article: How Shazam Works [...]

  32. How magic works « olliepop's blog Says:

    [...] magic works Jump to Comments How Shazam WorksCreating Shazam in [...]

  33. More Junkmail from Bob, #215 « xpda Says:

    [...] http://laplacian.wordpress.com/2009/01/10/how-shazam-works/ [...]

  34. Shazam – How do they do that? « Rajan's Blog Says:

    [...] What’s the name of that music? Categories: General Tags: Music Comments (0) Trackbacks (0) Leave a comment Trackback [...]

  35. Pedro Says:

    This is what Shazam says how it works.

    But it actually works different:

    If you tag Shazam, the music that is played goes into a huge call center with hundreds of jobless music journalists and record store owners. One hears the music and just writes down the name of the band and Song. If he doesn’t know it he calls out to his co-workers and shouts “Anybody knows this song?”

    This is the true magic of Shazam :-)

  36. Shazam! « inkiostro Says:

    [...] for which many of us bought a smartphone. Many people wonder how it works: here is an explanation (a bit technical, but not too much) that takes away a bit [...]

  37. katiedreke: How Shazam Wo… | Planner Collective Tweet Farm Says:

    [...] @katiedreke: How Shazam Works – http://laplacian.wordpress.com/2009/01/10/how-shazam-works/ [...]

  38. Life's simple, why change it? » Blog Archive » links for 2010-09-23 Says:

    [...] How Shazam Works Explanation of how the music identification service, Shazam, works. (tags: shazam music fingerprinting service identification) [...]

  39. So It's Come To This: How Shazam Works Says:

    [...] the one above, it uses the first key (in this case 823.44), and it searches for all matching songs.How Shazam Works – Free Won’t via KottkeTweetShare0 Responses to “How Shazam Works” Feed for [...]

  40. Friday Links: Goodnight Moon Edition Says:

    [...] How does Shazam work? You’re saying there isn’t a team of people over there who are old music fanatics just identifying one song after another? [...]

  41. This Is How Shazam Works (Shazam = iPhone Application That Makes You Look Like A Crazy Person In A Restaurant) | Says:

    [...] Every time I use it I swear it’s magic.  How can a telephone listen to a tiny snippet of a song and somehow tell you the name, artist, and how to buy it?!  Sadly, it’s not magic, but it’s still equally as interesting.  Here’s the full article explaining everything: http://laplacian.wordpress.com/2009/01/10/how-shazam-works/ [...]

  42. Shazam – ecco come funziona il riconoscimento delle canzoni! - iPhone Italia – Il blog italiano sull'Apple iPhone 4, iPhone 3GS e 3G Says:

    [...] [via] Tags: how-to, Shazam Emiliano Contarino (25 September 2010 10:35) Share. Use of the content of this article is subject to the terms of the Creative Commons License. Distribution, reproduction, and the creation of derivative works for non-commercial purposes are permitted, provided the source is cited. [...]

  43. links for 2010-09-28 « SensDIGITAL S.A.S. Says:

    [...] How Shazam Works [...]

  44. Creating Shazam in Java « Mammoth-servers Says:

    [...] A couple of days ago I encountered this article: How Shazam Works [...]

  45. Joegle Says:

    [...] How Shazam works Tagged with: music • programming  0 Comments Leave A Response [...]

  46. » links for 2010-10-07 Thej Live Says:

    [...] How Shazam Works « Free Won’t here is a cool service called Shazam, which take a short sample of music, and identifies the song.  There are couple ways to use it, but one of the more convenient is to install their free app onto an iPhone.  Just hit the “tag now” button, hold the phone’s mic up to a speaker, and it will usually identify the song and provide artist information, as well as a link to purchase the album. [...]

  47. Come funziona Shazam? | Indie Riviera Says:

    [...] there is also some technology behind it, and then he wrote a fairly detailed explanation. The post by this “curious engineer” (is that what they say now, ed.?) was pointed out by Inkiostro a few days ago, I did not do [...]

  48. Chris Says:

    Great discussion – i’m learning a ton.

    With all the talk about Shazam – SoundHound really blows me away. I had my son sing the Billionaire song and it found it – sure he’s a pretty good singer but there’s no hash-table fingerprint for my son singing this song – that’s pretty magical too. Anyone know the key differences between the two technology approaches?

  49. HackingScene Web News » How to protect photos online | Ask Jack Says:

    [...] have the same fingerprint: these will be identical or highly similar images. (This is basically how the Shazam system identifies music tracks.) One advantage of digital fingerprinting is that it works with any images anywhere: they [...]

  50. Hartelijk maandagmorgen | julesj | blog Says:

    [...] listen to a bit of music and it tells you exactly what it is. Also curious how this works? Read the complete explanation and then still understand nothing [...]

  51. THE CURIOUS SCHMOOZER – HOW DOES SHAZAM WORK? « uSchmooze – The Networker's Blog Says:

    [...] OK, but how does Shazam make these fingerprints? As Avery Wang, Shazam’s chief scientist and one of its co-founders, explained to Scientific American in 2003, the company’s approach was long considered computationally impractical—there was thought to be too much information in a song to compile a simple signature. But as he wrestled with the problem, Wang had a brilliant idea: What if he ignored nearly everything in a song and focused instead on just a few relatively “intense” moments? Thus Shazam creates a spectrogram for each song in its database—a graph that plots three dimensions of music: frequency vs. amplitude vs. time. The algorithm then picks out just those points that represent the peaks of the graph—notes that contain “higher energy content” than all the other notes around it, as Wang explained in an academic paper he published to describe how Shazam works (PDF). In practice, this seems to work out to about three data points per second per song. [...]

  52. mrk Says:

    ok, for the first time I’ve found a melody that Shazam did not find :( .. but I spent many hours in front of the PC, and I’ve found it. so, my question is, how do I upload this information to Shazam, so that if other people are searching for this melody, Shazam will tell them its name?
    the melody is Sound System – Dreamscape.
    thanks.

  53. Manyx Says:

    I really liked the blog and the truth behind shazam. Nice to see.

  54. How Shazam Works… « Voice Recognition Club Blog Says:

    [...] Shazam’s chief scientist and one of its co-founders, explained to Scientific American in 2003, the company’s approach was long considered computationally impractical—there was thought to be too much information in a song to compile a simple signature. But as he wrestled with the problem, Wang had a brilliant idea: What if he ignored nearly everything in a song and focused instead on just a few relatively “intense” moments? Thus Shazam creates a spectrogram for each song in its database—a graph that plots three dimensions of music: frequency vs. amplitude vs. time. The algorithm then picks out just those points that represent the peaks of the graph—notes that contain “higher energy content” than all the other notes around it. In practice, this seems to work out to about three data points per second per song. [...]

  55. Ziiiehhaarmonniicar2008 Says:

    Hello music listeners. There is a great website where you and your friends can download new hits for free! You can go directly to http://www.party-hits.org/packets/PartyMix_part1.rar and download your favorite tracks. I think it’s really great. Definitely my top tip for next year!!!

  56. How Shazam Works « zkeesh.com Says:

    [...] full article [...]

  57. Clark Says:

    Is there a way to upload all tags to the computer? website? anything? I’m trying to find a program that will identify the songs and you can view or sort as a whole. I have over 500 tags so if I want to purchase song 280 I don’t want to have to keep scrolling to find that song. It usually crashes and I have to start all over again.

  58. Shazam « The Droning Inquisition Says:

    [...] How Shazam Works and Creating Shazam in Java. [...]

  59. IntoNow iPhone app is like SoundHound for TV | Jeffrey Donenfeld : Blog Says:

    [...] – IntoNow app can tell what show you’re watching, won’t knock your Glee addictionHow Shazam/SoundHound WorksHow Google Goggles WorksHolographic BarcodesTechCrunch – IntoNow Can Hear What You’re [...]

  60. Grant Says:

    Is there a simple way to prevent Shazam from identifying a song with the use of an editor like GoldWave? I’d like to run a song guessing competition but I don’t want any cheaters.

    Thanks.

  61. Shazam! – Identify Music on the Move | My Portable World Says:

    [...] as us and want to know how Shazam works (no, it’s not black magic), you can head over to this blog post where you can read all the tech details on how this service works. You can learn more about Shazam [...]

  62. Joey Says:

    @Grant, You’re in luck: http://www.csh.rit.edu/~parallax/

  63. How does Shazam work? « tec.hojito Says:

    [...] OK, but how does Shazam make these fingerprints? As Avery Wang, Shazam’s chief scientist and one of its co-founders, explained to Scientific American in 2003, the company’s approach was long considered computationally impractical—there was thought to be too much information in a song to compile a simple signature. But as he wrestled with the problem, Wang had a brilliant idea: What if he ignored nearly everything in a song and focused instead on just a few relatively “intense” moments? Thus Shazam creates a spectrogram for each song in its database—a graph that plots three dimensions of music: frequency vs. amplitude vs. time. The algorithm then picks out just those points that represent the peaks of the graph—notes that contain “higher energy content” than all the other notes around it, as Wang explained in an academic paper he published to describe how Shazam works (PDF). In practice, this seems to work out to about three data points per second per song. [...]

  64. John Flynn Says:

    I’m interested in fingerprinting my Son’s album so people who hear his songs on radio can identify them in Shazam and other applications. How do I present the music to them?

  65. THE CURIOUS SCHMOOZER – HOW DOES SHAZAM WORK? | uSchmooze Says:

    [...] OK, but how does Shazam make these fingerprints? As Avery Wang, Shazam’s chief scientist and one of its co-founders, explained to Scientific American in 2003, the company’s approach was long considered computationally impractical—there was thought to be too much information in a song to compile a simple signature. But as he wrestled with the problem, Wang had a brilliant idea: What if he ignored nearly everything in a song and focused instead on just a few relatively “intense” moments? Thus Shazam creates a spectrogram for each song in its database—a graph that plots three dimensions of music: frequency vs. amplitude vs. time. The algorithm then picks out just those points that represent the peaks of the graph—notes that contain “higher energy content” than all the other notes around it, as Wang explained in an academic paper he published to describe how Shazam works (PDF). In practice, this seems to work out to about three data points per second per song. [...]

  66. Bala Sai Krishna Says:

    Can someone explain this?
    After collecting the constellation points, each point is treated as an anchor point and its respective target zone is found. Now, how is the target zone identified? Is it by extracting the nearest neighbors? The picture (1C in the paper) doesn’t depict that.

  67. Shazam: the instant music guru « criticalnewmediagroup Says:

    [...] How Shazam works [...]

  68. Quora Says:

    How does Shazam work?…

    This link talks about how it works. Also, there is a paper written by its developer/ http://laplacian.wordpress.com/2009/01/10/how-shazam-works/…

  69. brutus Says:

    hey, its very interesting… thanks for the explanation.

    Is there something to know about the error rate? I guess the standard deviation from sample freq. to song freq. is software dependent and manually controllable?

  70. Spectral Analysis with the DFT « Cardinal Peak's Blog Says:

    [...] http://laplacian.wordpress.com/2009/01/10/how-shazam-works/ [...]

  71. How does audio fingerprinting work? - Quora Says:

    [...] into the episode you are.1 CommentLoading… • Post • Mar 4, 2012   Anon User http://laplacian.wordpress.com/2…is a pretty good explanation of… (more) Sign up for free to read the full text. Login if you [...]

  72. raph Says:

    Hi, I talked to a guy who ran a competitor in 2007, 2008. He said the real magic is cleaning up the input and making it accessible to the fingerprinting. Another thing that takes a lot of fine-tuning is how much deviation from the fingerprint to allow for the song to still be recognized.
    Radio stations, like all the Clear Channel ones, pitch up the music slightly to make it sound more energetic and upbeat. Shazam still recognizes this.
    Shazam focuses on radio more, so their actual “quick” database doesn’t have to be so big, and can factor in other parameters like location and time, which could in theory reduce the list of candidates to a handful or two.

    Still, great explanation. Very accessible. I read one of the patents and they’re very generic about the actual procedure, and hard to understand at the same time.
    This is much better! Thanks, appreciated!

  73. Come funziona Shazam (e un clone come Midomi): spiegazione tecnica @ Marco Famà, Fotografo sportivo e di reportage a Torino Says:

    [...] at this point, I’ll stop. If you are interested, I refer you to this article in English (from which I took all the information above) for further [...]

Comments are closed.

