Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 5

What is Audio Fingerprinting

Audio fingerprinting is the process of representing an audio signal in a compact way by


extracting relevant features of the audio content. It works on the principle of human
fingerprinting. It records the fingerprint of ingested audio content and later can be used
to match with the recorded audio or playlists on mobile, TV or any other device.

Why is Audio Fingerprinting Needed?


There can be multiple reasons to have audio fingerprinting technology in place.
Along with audio track identification, one might want to know the singer/ writer
of the song, wherein the audio track we are listening to. This allows us to
synchronize the multiple audio pieces and provides more interactive
experiences. It is an efficient mechanism to establish the perceptual equality of
two audio objects. The major advantages in developing a system with audio
fingerprints are reduced memory /storage requirements, valuable comparison,
perceptual irrelevancies getting removed and useful search.

An effective audio fingerprinting algorithm involves the following qualities:

 Discriminative power
 Distortion invariance
 Compactness
 Computational simplicity
 Minimum length of query to retrieve

The Problems
It has in the past been very difficult to successfully identify electronic sounds
because of the actual digital audio itself.

Regardless of the equipment you are using and the environment you are in, the
human ear listening to the same piece of audio will essentially hear the same thing.

For example, Blue Monday by New Order is very similar whether it played from an
MP3 encoded file that has high-quality bit rate or from a CD.

The difference is that a computer that has been tasked with opening, storing and
cataloguing the same recordings will identify them as being different.
How Audio Fingerprinting Solves
These Problems
How does modern audio fingerprinting get around this issue? It relies on a software
algorithm to extract a section or small audio sample of the music and generate a
digital summary of the recording’s main attributes.

Parameters such as time, intensity and frequency are used to create a virtual
overview of the anchor points and peaks for these attributes.

Take, for example, if you have a recording with a notable spike in frequency at 1
minute of 823-hertz, the software would mark that spot.

For the whole audio sample, all of the marks that are generated are connected by a
single line. Lots more lines are produced and together these form the fingerprint of
the audio.

What Audio Fingerprinting Can Be


Used For
They have many uses which include, but aren’t limited to:

 Identifying a song
 Identify advertisements through the tunes or melodies it contains
 Music or sound library management i.e. library of sound effects
 Video file identification
 Other media identification

Looking at the above, it would seem that an audio fingerprint is only used to identify
files but its other main use is monitoring and tracking other useful metrics on the
following media platforms:

 Radio
 TV
 Records
 Movies
 CDs
 Streaming platforms
 Peer-to-peer networks.

This type of monitoring has been used to track compensation owed to artists,
ensuring compliance with copyright, tracking interaction and engagement data
(number of listens, duration etc), and licensing among other things.

Properties of Audio Fingerprinting

Accuracy: The number of correct identifications, missed identifications, and wrong


identifications (false positives).

Reliability: Methods for assessing that a query is present or not in the repository of items to
identify is of major importance in play list generation for copyright enforcement
organizations. In such cases, if a song has not been broadcast, it should not be identified as a
match, even at the cost of missing actual matches. In other applications, like automatic
labeling of MP3 files (see Section 6), avoiding false positives is not such a mandatory
requirement.

Robustness: Ability to accurately identify an item, regardless of the level of compression and
distortion or interference in the transmission channel. Other sources of degradation are
pitching, equalization, background noise, D/A-A/D conversion, audio coders (such as GSM
and MP3), etc.

Granularity: Ability to identify whole titles from excerpts a few seconds long. It requires to
deal with shifting, that is lack of synchronization between the extracted fingerprint and those
stored in the database and it adds complexity to the search (it needs to compare audio in all
possible alignments).

Security: Vulnerability of the solution to cracking or tampering. In contrast with the


robustness requirement, the manipulations to deal with are designed to fool the fingerprint
identification algorithm.

Versatility: Ability to identify audio regardless of the audio format. Ability to use the same
database for different applications.

Scalability: Performance with very large databases of titles or a large number of concurrent
identifications. This affects the accuracy and the complexity of the system.
Complexity: It refers to the computational costs of the fingerprint extraction, the size of the
fingerprint, the complexity of the search, the complexity of the fingerprint comparison, the
cost of adding new items to the database, etc.

Fragility: Some applications, such as contentintegrity verification systems, may require the
detection of changes in the content. This is contrary to the robustness requirement, as the
fingerprint should be robust to content-preserving transformations but not to other distortions

Applications of Audio Fingerprinting


Music Retrieval
Music recognition is an integral application of audio fingerprinting. The specific
feature of a song or music signal is captured as a fingerprint. This unique
metadata made it possible to identify and retrieve the song from millions of
databases. An ideal audio fingerprinting system will give accurate retrieval result
even in a noisy environment. An efficient searching algorithm is also required to
provide the best and precise results.

Watermarking
Digital audio watermarking allows to embed a secret digital signature to the
audio signal. Audio fingerprinting allows the signal processing approach to
generate the digital signature. This embedded signature can be used to verify
the authenticity of the audio signal. The advantage of using fingerprinting
techniques is that it is less vulnerable to attacks since any attempt to change the
fingerprint will alter the quality of the audio.

Broadcast Monitoring
Audio fingerprinting techniques are used to monitor a song or an advertisement
broadcast by the TV or radio channel. This might be useful as the royalty for the
song is applicable while it broadcasts on TV or radio. Similarly, advertisers pay
based on the number of times it airs in the media. Audio fingerprinting
techniques are commonly used for TV viewership analytics and content
protection also.

Voice Identification – For Automotive Use-case


Audio fingerprinting techniques can be used to identify the voice. With the
advancement of smart devices, nowadays, the demand for a robust speaker
identification system has increased. Lots of automotive use cases like car
infotainment systems, where voice-based commands with user profiles, can be
realized with an accurate fingerprinting system.
Other applications include automatic music library organization using genre
classifications, music trainer using vocal analysis, audio geographical mapping,
etc.

You might also like