Saturday, July 26, 2014

Audio Compression Techniques

Sound is nothing more than a vibration in the medium (usually air or water) surrounding us that our ears can detect. These vibrations need to occur between roughly 20 and 20,000 times a second for us, as humans, to hear them. To make this easier to talk about, we measure the rate of vibration in cycles per second (CPS) or, equivalently, Hertz (Hz). Our sensitivity to this range of frequencies is where the golden yardstick for audio equipment comes from: a frequency response of 20Hz to 20,000Hz.

For years, audio was recorded in its original analog form on records and, later, magnetic tape. By the early '80s, there was a real push to produce a more portable and reliable format for audio storage and, through the miracle of digital technology, we got one in the form of the audio CD.

For a more in-depth look at the state of audio, read through this book by Ken Pohlmann.

In the jump from analog to digital, certain things must happen. The analog audio is first sampled, and each sample is then digitized (turned into a number). With CD audio, this sampling takes place 44,100 times every second on each channel, with each sample represented by a 16-bit digital word. The resulting digital file tends to be pretty large; a one-minute stereo sound file at CD quality eats up about 10MB of storage (44,100 samples × 2 bytes × 2 channels × 60 seconds, or roughly 10.6 million bytes). Upon the release of the original audio CD, there were rumors that it was designed to hold 74 minutes of high-fidelity music, which corresponded to Wilhelm Furtwängler's recording of Beethoven’s 9th Symphony from the 1951 Bayreuth Festival. In later years, this rumor was squelched, but it still makes a pretty good story. With later updates to the audio CD format, present audio CDs hold up to 80 minutes of audio.
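The storage figure above is easy to check with a few lines of Python. This is only a sketch of the arithmetic; the function name `pcm_bytes` is made up for illustration, and it counts raw sample data only, ignoring any file header.

```python
def pcm_bytes(seconds, sample_rate=44_100, bits_per_sample=16, channels=2):
    """Size in bytes of uncompressed PCM audio (raw samples, no file header)."""
    bytes_per_sample = bits_per_sample // 8
    return seconds * sample_rate * bytes_per_sample * channels


one_minute = pcm_bytes(60)
print(one_minute)                         # 10584000 bytes
print(round(one_minute / 1_000_000, 2))   # 10.58 (decimal megabytes)
print(round(one_minute / 2**20, 2))       # 10.09 (binary megabytes)
```

Whichever definition of a megabyte is used, one minute of CD-quality stereo lands at about 10MB.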

Those of us who lived through the early days of digital audio remember what it was like to try to carry a portable CD player as gently as possible. Any bump could cause skips and pauses in the music. For that and several other reasons, portable digital audio started to take shape as a purely silicon-based item. Some of the digital audio players from the late '90s had only 32MB of on-board storage. Without extra help, those little players could hold only a little over three minutes of CD-quality audio. Enter the idea of data compression.
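The "little over three minutes" figure follows from dividing the player's capacity by the CD data rate. Here is a rough sketch; the helper name `minutes_of_cd_audio` is hypothetical, and it assumes every byte of the 32MB is available for audio, which a real player would not quite manage.

```python
# CD data rate: 44,100 samples/s x 2 bytes per sample x 2 channels
BYTES_PER_SECOND = 44_100 * 2 * 2


def minutes_of_cd_audio(capacity_bytes):
    """Minutes of uncompressed CD-quality stereo that fit in capacity_bytes."""
    return capacity_bytes / BYTES_PER_SECOND / 60


print(round(minutes_of_cd_audio(32 * 2**20), 2))       # 3.17 (binary megabytes)
print(round(minutes_of_cd_audio(32 * 1_000_000), 2))   # 3.02 (decimal megabytes)
```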

Data compression is an attempt to reduce the size of digital data. Grossly oversimplified, there are two kinds: lossless compression and lossy compression. Lossless compression looks for data that is repeated or predictable and replaces it with a shorter note from which the original can be rebuilt. This type of compression can typically shrink an audio file to somewhere around 50% to 60% of its original size, depending on the material. When reconstructed, a lossless file will be bit-for-bit identical to the original file. FLAC is an example of a lossless compressed audio format.
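The defining property of lossless compression, getting back exactly what you put in, can be demonstrated with Python's standard `zlib` module. To be clear about the assumptions: `zlib` is a general-purpose compressor, not an audio codec like FLAC, and the test signal here is a synthetic sine tone (the helper `sine_pcm` is made up for this sketch), which is far more repetitive than real music. The size reduction it shows is therefore not representative; only the round trip is the point.

```python
import math
import struct
import zlib

SAMPLE_RATE = 44_100


def sine_pcm(freq_hz, seconds, amplitude=0.5):
    """Mono 16-bit little-endian PCM bytes for a pure sine tone."""
    n = int(SAMPLE_RATE * seconds)
    samples = (
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n)
    )
    return b"".join(struct.pack("<h", s) for s in samples)


# 441 Hz has a period of exactly 100 samples at 44.1 kHz, so the data is highly repetitive.
original = sine_pcm(441, 1.0)
packed = zlib.compress(original, 9)
restored = zlib.decompress(packed)

print(len(original), len(packed))   # original size vs. compressed size in bytes
print(restored == original)         # True: nothing was lost
```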

Lossy compression goes a step or two further and, with the help of psychoacoustics, tries to predict which sounds would be either outside our hearing range or masked by other sounds happening at the same time. For example, many people cannot hear frequencies above 15,000Hz, so MP3 encoders often simply throw that data away. As I mentioned in an earlier post, and demonstrated with photos, the MP3 compression format has a real negative effect on an audio file. For a much more meaningful explanation of compression, have a look at this research paper: Introduction to Data Compression, by Guy E. Blelloch.

Psychoacoustic masking is best explained with a demonstration. Assume for a moment that two people, Phil and Bob, are standing within 20 yards of a Civil War era cannon. Just as Bob starts speaking to Phil, the cannon is fired, producing a deafening BOOM! Predictably, Phil is not able to hear what Bob says. Lossy encoders use this same principle to predict which sounds in an audio file would not be heard and could, therefore, be removed from the data file. Typical lossy sound files, e.g. MP3 and AAC, use psychoacoustics to arrive at approximately 1MB for every minute of audio.
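The "1MB per minute" rule of thumb lines up with a common encoder setting. The sketch below assumes a constant bitrate of 128 kbps, which is my assumption and not a figure from this post, and the function name `compressed_bytes` is made up for illustration.

```python
def compressed_bytes(seconds, bitrate_kbps):
    """Approximate size in bytes of a constant-bitrate stream (1 kbps = 1,000 bits/s)."""
    return seconds * bitrate_kbps * 1000 // 8


CD_BYTES_PER_MINUTE = 44_100 * 2 * 2 * 60   # uncompressed CD-quality stereo

size = compressed_bytes(60, 128)             # assumed bitrate: 128 kbps
print(size)                                  # 960000 bytes, just under 1MB
print(round(CD_BYTES_PER_MINUTE / size, 1))  # 11.0 times smaller than the CD data
```

At that assumed bitrate, a 32MB player goes from holding about three minutes of audio to holding a little over half an hour.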

An inexpensive way to find out more about the sound quality differences between lossless and lossy files would be to purchase a SanDisk Sansa Clip+.
