
Click the following links to learn more about sound:

  • Sound: dosits.org
  • Sound Waves: macrosonix.com
  • Waveforms: techtarget.com
  • Auditory Perception: wisegeek.com
  • Amplitude: howmusicworks.org
  • Decibels: earq.com
  • Sound Digitization: cs.cf.ac.uk
  • Sampling: clara.net
  • Quantization: mediacollege.com
  • Sound Compression: music.tutsplus.com
  • Digital Audio Compression: ncsu.edu
  • MP3 Compression: techradar.com
  • Threshold of Hearing: hyperphysics.phy
  • Masking: vivid-acoustics.com

On this page you will learn about sound compression and MP3 compression, and how to carry out file size calculations.

What is Sound Compression?

Many multimedia applications deliver audio over the Internet, and uncompressed audio files are large: a three-minute song recorded in stereo at CD quality (44.1 kHz, 16-bit) takes up about 32 MB. There is therefore a great need to minimise the amount of data used when audio is part of multimedia, especially when it is delivered over the Internet. Reducing the amount of data in this way is known as compression. When compression retains all of the original data from a file, it is known as ‘lossless’; when it doesn’t retain all of the data, it is known as ‘lossy’ (I bet you’ve never heard that word before!).

Digital Audio Compression

Because sound waveforms are complex and unpredictable, they are difficult to compress with lossless methods (compression without loss of data). Digital audio also has high data rates, second only to digital video. Compressing digital audio therefore makes it more efficient to store and transmit.
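The effect of unpredictability on lossless compression can be sketched with Python's standard zlib module. This is a general-purpose lossless compressor, not an audio codec, and the pseudo-random bytes are only an assumed stand-in for a complex waveform. The names `predictable` and `unpredictable` are illustrative.

```python
import random
import zlib

random.seed(0)  # fixed seed so the run is repeatable

# 10,000 bytes of a simple repeating pattern (highly predictable).
predictable = bytes([0, 64, 128, 192]) * 2500

# 10,000 pseudo-random bytes, standing in for a complex, noise-like waveform.
unpredictable = bytes(random.randrange(256) for _ in range(10_000))

for name, data in (("predictable", predictable), ("unpredictable", unpredictable)):
    compressed = zlib.compress(data, 9)
    # Lossless: decompressing returns exactly the original bytes.
    assert zlib.decompress(compressed) == data
    print(f"{name}: {len(data)} bytes -> {len(compressed)} bytes")
```

The repeating pattern shrinks to a small fraction of its size, while the noise-like data barely shrinks at all, which is why audio usually relies on lossy methods for large savings.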

MP3 Compression

As uncompressed audio files can be very large, they are usually compressed. MP3 compression is lossy: the original audio can’t be reconstructed from an MP3 file at its full original quality. However, MP3 files retain acceptable quality, and most people won’t even realise that the audio has been compressed. The file size is greatly reduced.
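To give a rough feel for the size reduction, the sketch below compares an uncompressed CD-quality stereo track with an MP3 of the same length. The 128 kbps MP3 bit-rate is an assumption chosen for illustration (it is a common setting, but it does not come from this page), and `size_mb` is a hypothetical helper name.

```python
CD_BIT_RATE_KBPS = 44_100 * 16 * 2 / 1_000  # uncompressed CD stereo: 1,411.2 kbps
MP3_BIT_RATE_KBPS = 128                      # assumed MP3 setting, not from this page


def size_mb(bit_rate_kbps, seconds):
    """File size in MB: kilobits -> kilobytes (/ 8) -> megabytes (/ 1,000)."""
    return bit_rate_kbps * seconds / 8 / 1_000


seconds = 3 * 60  # a three-minute song
uncompressed = size_mb(CD_BIT_RATE_KBPS, seconds)
compressed = size_mb(MP3_BIT_RATE_KBPS, seconds)

print(f"Uncompressed: {uncompressed:.2f} MB")
print(f"MP3 at 128 kbps: {compressed:.2f} MB")
print(f"Roughly {uncompressed / compressed:.1f} times smaller")
```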

[Image: the process of compression.]

MP3 compression is achieved by discarding the less important parts of the audio spectrum, including higher-frequency sounds and sounds that may be hidden behind louder sounds. Effectively, the compression discards data that has no perceivable effect on the sound, so the change shouldn’t be noticed. MP3 encoding is based on two principles: the threshold of hearing and masking.

Threshold of Hearing

The threshold of hearing is the minimum level at which a sound can be heard. MP3 compression uses it to discard sounds that humans can’t hear, including any sounds outside the human auditory range (below 20 Hz or above 20 kHz). There is no point in including these sounds in the file if they can’t be heard. The threshold of hearing varies non-linearly with frequency, as the image below shows.

[Image: graph of the threshold of hearing.]
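The shape of the curve can be sketched numerically. The formula below is a widely quoted approximation of the threshold in quiet, usually attributed to Terhardt; it is not taken from this page and is only a rough model. The function names are hypothetical, and the 20 Hz to 20 kHz limits are the ones given above.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate threshold of hearing in dB SPL (Terhardt-style formula)."""
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def within_hearing_range(freq_hz):
    """True for frequencies inside the 20 Hz to 20 kHz range given above."""
    return 20 <= freq_hz <= 20_000


for freq in (10, 100, 1_000, 3_300, 15_000, 25_000):
    if within_hearing_range(freq):
        print(f"{freq:>6} Hz: threshold about {threshold_in_quiet_db(freq):.1f} dB")
    else:
        print(f"{freq:>6} Hz: outside the audible range, can be discarded")
```

Under this approximation the ear is most sensitive at a few kHz, and the threshold rises steeply towards both ends of the range, which is the non-linear behaviour the graph describes.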

Masking

Masking modifies the threshold of hearing curve in the area around a loud tone, as the image below shows. The loud tone that the modified curve is centred on is known as the masking tone. A sound that falls below the modified curve is inaudible, even if it rises above the unmodified threshold of hearing. Because of this, sounds below the curve can be encoded with less data.

[Image: graph showing the masking tone.]
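The idea can be sketched with a toy model. This is not the psychoacoustic model a real MP3 encoder uses: the flat 5 dB quiet threshold, the 10 dB offset and the 25 dB-per-octave slope are made-up illustrative values, and all function names are hypothetical.

```python
import math

QUIET_THRESHOLD_DB = 5.0  # hypothetical flat threshold of hearing, for simplicity


def masked_threshold_db(freq_hz, masker_freq_hz, masker_level_db,
                        offset_db=10.0, slope_db_per_octave=25.0):
    """Threshold of hearing, raised in the area around a loud masking tone."""
    octaves_away = abs(math.log2(freq_hz / masker_freq_hz))
    raised = masker_level_db - offset_db - slope_db_per_octave * octaves_away
    return max(QUIET_THRESHOLD_DB, raised)


def is_audible(freq_hz, level_db, masker_freq_hz, masker_level_db):
    """A sound is audible only if it rises above the modified curve."""
    return level_db > masked_threshold_db(freq_hz, masker_freq_hz, masker_level_db)


# Masking tone: 1 kHz at 70 dB. Both test sounds are at 40 dB,
# well above the unmodified 5 dB threshold.
print(is_audible(1_100, 40.0, 1_000, 70.0))  # close to the masking tone
print(is_audible(8_000, 40.0, 1_000, 70.0))  # three octaves away
```

In this toy model the 40 dB sound near the masking tone is hidden, while the same level three octaves away remains audible, so only the second would need to be encoded in full.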

File Size Calculations

Audio files come in different sizes, and it can be important to know the size for many reasons. For example, somebody may want their files to take up less space when creating a website, or may simply be trying to save drive space. It is therefore useful to be able to carry out file size calculations.

Important Information to Know

  • Waveform audio files (WAV) store uncompressed audio
  • Audio has a resolution (measured in bits) and a sampling rate (measured in Hz)
  • The bit-rate is the number of bits that are processed per second
  • Bit-Rate (kbps) = Sampling Rate x Resolution x Number of Channels x Time (Seconds) / Bits per Kilo-Bit, where the time is 1 second because a bit-rate is a per-second figure
  • The bit-rate can be used for the calculation of file sizes of digital audio files
  • There are 8 bits per byte
  • Mono audio files have 1 channel while stereo audio files have 2
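The bit-rate formula in the list above can be sketched as a small function. Time is left out of the parameters because a bit-rate describes a single second of audio. The function and constant names are illustrative.

```python
BITS_PER_KILOBIT = 1_000


def bit_rate_kbps(sampling_rate_hz, resolution_bits, channels):
    """Bits processed in one second of audio, expressed in kilobits per second."""
    bits_per_second = sampling_rate_hz * resolution_bits * channels
    return bits_per_second / BITS_PER_KILOBIT


print(bit_rate_kbps(44_100, 16, 2))  # 16-bit stereo at 44.1 kHz
print(bit_rate_kbps(44_100, 8, 1))   # 8-bit mono at the same sampling rate
```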

Example - Uncompressed CD Stereo Audio Track

Firstly, we’ll go through an example: finding out how many MB (megabytes) of space a 60-second uncompressed CD stereo audio track requires.

First, we need to work out the bit-rate in kbps, so we use the formula:

Bit-Rate (kbps) = Sampling Rate x Resolution x Number of Channels x Time (Seconds) / Bits per Kilo-Bit.

We know that CDs use a sampling rate of 44.1 kHz, or 44,100 Hz, and a resolution of 16 bits. We know that stereo files use 2 channels. For the time we use 1 second, because a bit-rate is a per-second figure; the full 60 seconds is applied in the next step. We know there are 1,000 bits per kilo-bit.

So, we get the calculation 44,100 x 16 x 2 x 1 / 1,000 = 1,411.2 kbps.

As we’re considering a 60-second audio file, we multiply our answer in kbps by 60 to get the total in kilobits, and then divide by 8 to get the answer in kilobytes: 1,411.2 x 60 / 8 = 10,584 KB. Dividing by 1,000 gives our final answer of about 10.6 MB of space.
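The conversion from bit-rate to file size in this example can be sketched as follows. The function name is hypothetical, and the code follows this page's convention of 1,000 KB per MB.

```python
BITS_PER_BYTE = 8


def file_size_mb(bit_rate_kbps, seconds):
    """Convert a bit-rate and a duration into a file size in megabytes."""
    kilobits = bit_rate_kbps * seconds        # total kilobits in the file
    kilobytes = kilobits / BITS_PER_BYTE      # 8 bits per byte
    return kilobytes / 1_000                  # 1,000 KB per MB


size = file_size_mb(1_411.2, 60)  # 60 seconds of CD stereo audio
print(f"{size:.3f} MB, or about {size:.1f} MB")
```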

Try it Yourself!

Try out the following two practice questions. Work out what you think is the correct solution and then compare your result to the solution given below each question. Warning: each solution appears directly below its question, so don’t continue reading until you’ve got your answer.

Practice Question 1

Calculate the file size, in MB, of a 2-minute WAV stereo audio file sampled at 22 kHz using a 16-bit resolution.

Solution: 22,000 (sampling rate) x 16 (bits) x 2 (number of channels, two for stereo) x 120 (seconds in a two-minute file) / 1,000 (bits per kilobit) = 84,480 kilobits. 84,480 / 8 (bits per byte) = 10,560 KB. 10,560 / 1,000 = 10.56 MB file size.

Practice Question 2

Calculate the total file size, in bits, of a 1-second WAV mono audio file sampled at 44.1 kHz using an 8-bit resolution.

Solution: 44,100 (Sampling rate) x 8 (bits) x 1 (one mono channel) x 1 (one second file) = 352,800 bits.
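If you want to check your own working, both practice answers can be reproduced with a short script. This is just one way of laying out the arithmetic; the function names are illustrative.

```python
def total_bits(sampling_rate_hz, resolution_bits, channels, seconds):
    """Total number of bits in an uncompressed audio file."""
    return sampling_rate_hz * resolution_bits * channels * seconds


def bits_to_mb(bits):
    """Bits -> bytes (/ 8) -> KB (/ 1,000) -> MB (/ 1,000)."""
    return bits / 8 / 1_000 / 1_000


q1_bits = total_bits(22_000, 16, 2, 120)  # Practice Question 1
q2_bits = total_bits(44_100, 8, 1, 1)     # Practice Question 2

print(f"Question 1: {bits_to_mb(q1_bits):.2f} MB")
print(f"Question 2: {q2_bits:,} bits")
```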