The FezGuys
Variable Bitrate MP3 Encoding
[ No. 57 - July 2001 ]

To those with trained ears, much of what passes for Internet Audio is defined by its limitations. Some sanguine engineers, realizing the value of balancing speed with pristine sound, have incorporated an oft-overlooked but eminently usable MP3 encoding technology called variable bitrate (VBR). VBR is used in many other encoding systems and codecs, especially video, but our context here is MP3 files. Part of the MPEG standard, VBR was first implemented for MP3-encoded files by Xing Technologies (the developer of AudioCatalyst, now owned by RealNetworks) and has been available long enough that all recent MP3 players support it.

Simply put, VBR technology allows better sounding MP3 files in the same space taken up by your garden-variety constant bitrate (CBR) encoded song. To those who value small file size more than sound quality: think of it as providing the same sound quality in a smaller file or, conversely, better sound quality in a similar file size.

Let's take a closer look at how VBR works. Standard MP3 files are encoded at a single bitrate through the entire file (hence the CBR moniker). In contrast, a VBR encoder looks at the audio and chooses a bitrate based on how much audio information is present at any given moment. A song that begins quietly or with a single musical instrument will, during that section, be encoded at a lower bitrate than the middle of the song, when all the instruments are playing together and the volume and frequency range are high. As a result, most songs will be encoded at several different bitrates corresponding to fluctuations in dynamic range.

The key conceptual difference between CBR and VBR is that with CBR encoding you specify your compression by space, while with VBR you specify it by quality. With CBR, sonic quality is reduced as needed to maintain the bitrate you specify. With VBR, the bitrate is changed to meet the quality level desired. CBR is inefficient in that 10 seconds of silence encoded at 128kbps takes up the same space as 10 seconds of full-on opera. VBR encoding of the same audio would use very little space for the quiet sections and more for the loud sections (depending on the quality you chose). Like CBR, VBR encoding parameters are set at the time the song is encoded.
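To put rough numbers on the silence-versus-opera comparison, here is a small Python sketch. The 128kbps CBR figure comes from the text; the 32kbps and 160kbps VBR figures are our own made-up stand-ins for what an encoder might pick, and size_bytes is a hypothetical helper, not part of any encoder.

```python
# Rough size arithmetic for CBR vs. VBR (illustrative numbers only).

def size_bytes(segments):
    """Sum the bytes needed for a list of (seconds, kbps) segments.

    1 kbps is taken as 1000 bits per second, which is how MP3
    bitrates are counted.
    """
    return sum(seconds * kbps * 1000 // 8 for seconds, kbps in segments)

# CBR: every second costs the same, silent or not.
cbr = size_bytes([(10, 128), (10, 128)])

# VBR (assumed encoder choices): 32kbps for 10 seconds of silence,
# 160kbps for 10 seconds of full-on opera.
vbr = size_bytes([(10, 32), (10, 160)])

print(cbr)  # 320000
print(vbr)  # 240000
```

In this made-up case the VBR file is smaller overall even though the opera section gets more bits than it would at 128kbps CBR.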

The MP3 standard is designed so that information about the encoded bitrate level is included throughout the file. This makes it easy for VBR-enabled MP3 players to seamlessly decode and play VBR files. In fact, users are likely to have a more consistent listening experience with frequent bitrate changes in a VBR file than with CBR. In a CBR file, if you suddenly reach a more dynamic piece of a song, you're stuck with the same bitrate. This can result in audio artifacts (most noticeable when encoding to lower bitrates, 64kbps and less).
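For the curious, the bitrate information mentioned above lives in the 4-byte header that starts each MP3 frame. The following Python sketch reads the bitrate from one such header. It assumes MPEG-1 Layer III only (the usual case for 44.1kHz music files) and the function name frame_bitrate_kbps is our own invention; a real player handles more cases than this.

```python
# Bitrate table for MPEG-1 Layer III, indexed by the 4-bit bitrate field.
# Index 0 means "free format" and index 15 is invalid.
MPEG1_LAYER3_KBPS = [None, 32, 40, 48, 56, 64, 80, 96,
                     112, 128, 160, 192, 224, 256, 320, None]

def frame_bitrate_kbps(header):
    """Return the bitrate (kbps) declared by a 4-byte frame header."""
    if len(header) < 4:
        raise ValueError("need 4 header bytes")
    # A frame starts with 11 sync bits, all set.
    if header[0] != 0xFF or (header[1] & 0xE0) != 0xE0:
        raise ValueError("no frame sync")
    version = (header[1] >> 3) & 0x03  # 3 means MPEG-1
    layer = (header[1] >> 1) & 0x03    # 1 means Layer III
    if version != 3 or layer != 1:
        raise ValueError("this sketch handles MPEG-1 Layer III only")
    kbps = MPEG1_LAYER3_KBPS[header[2] >> 4]
    if kbps is None:
        raise ValueError("free-format or invalid bitrate index")
    return kbps

# Two headers as they might appear back to back in a VBR file.
print(frame_bitrate_kbps(bytes([0xFF, 0xFB, 0x50, 0x00])))  # 64
print(frame_bitrate_kbps(bytes([0xFF, 0xFB, 0x90, 0x00])))  # 128
```

Because every frame announces its own bitrate, a decoder never needs to be told in advance that the file is VBR; it just reads each header as it goes.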

VBR Comparisons

VBR encoding offers the ability to compress audio more efficiently by specifying the desired sound quality level rather than by setting a flat, numeric bitrate. Sounds great, yes! Alas, there is no standard for how to select this "quality" in your encoder, and that results in different settings for each encoding application. To help you figure things out, we've included a comparison between some common encoders for Windows and Macs. For each setting available we list what we'll call the "base bitrate": the starting point that VBR uses to encode. Compared to the same rate used for CBR files, a VBR-encoded file will sound better because it switches to higher bitrates as needed. We assume you Unix users can figure this out yourselves!

Windows Encoders
AudioCatalyst
VBR slider setting: "Quality"
Options available:
  Low - 96kbps (Near CD Quality - Acceptable for portables)
  Low/Normal - 112kbps (CD Quality - Best for portables)
  Normal - 128kbps (CD Quality - Best for most)
  Normal/High - 160kbps (Archival Quality - High-end stereos)
  High - 192kbps (Archival Quality - Highest-end stereos)
MusicMatch
VBR setting: "Custom Quality" percentage slider
Options available: 1%-100%, samples follow (see above audio description):
  10% = ~95kbps
  25% = ~105kbps
  50% = ~128kbps
  75% = ~170kbps
  100% = ~220kbps
Macintosh Encoders
AudioCatalyst
VBR slider setting: "Quality"
Options available:
  Low - 96kbps (Near CD Quality - Acceptable for portables)
  Low/Normal - 112kbps (CD Quality - Best for portables)
  Normal - 128kbps (CD Quality - Best for most)
  Normal/High - 160kbps (Archival Quality - High-end stereos)
  High - 192kbps (Archival Quality - Highest-end stereos)
  Very High - 224kbps
  Ultra High - 256kbps
Note: AudioCatalyst (Mac version only) has a glitch where if you have "CBR Quality" set to 24kbps or less, it overrides the "VBR Quality" regardless of which you have selected in "MP3 Mode".
iTunes
VBR setting: Quality
Options available: Lowest, Low, Medium Low, Medium, Medium High, High, Highest
iTunes has a preference called "guaranteed minimum bit rate" to control the VBR encoding. We couldn't find detailed documentation on how iTunes' VBR works, but trial and error showed the minimum bitrate setting drives the encoding. Setting it to 8kbps resulted in 8kbps files at all quality levels. We suggest using one of the tables from the other apps listed here as a reference for your preferred settings.

Note: MusicMatch wins two prizes this month: (1) The "Cross Platform Award" for adding Macintosh and Linux support and (2) the "Bad Boy of Media Types Award" for setting itself as the default player for all audio formats during regular installation (even after letting us specify not to).

Test Case

For Windows encoders, we used a PIII/600 with 256MB RAM and for Mac encoders we used a G4/450 with 256MB RAM.

Encoding Speed: Across all encoders, the difference between VBR and CBR 128k was only about one second. The fastest encoder was AudioCatalyst for Windows (averaging 5.9x realtime) and the slowest was AudioCatalyst for the Mac (due to its loading the track from the CD first, and encoding afterwards).

File Size: Here you can see the difference in file size for both clips at a few different VBR settings and how they compare to the typical 128k CBR. The number with each VBR listing is the "base bitrate" mentioned above.

Music    CBR(128k)  VBR(96k+)  VBR(128k+)  VBR(256k+)
File A   470K       368K       513K        983K
File B   470K       333K       485K        973K
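One way to read the table: since each VBR file holds the same clip as its 128k CBR counterpart, the ratio of file sizes gives a rough average bitrate for the VBR file. The Python sketch below does that arithmetic. It assumes file overhead (tags and the like) is small enough to ignore, and implied_kbps is our own hypothetical helper.

```python
# File sizes in K, copied from the table above.
SIZES_K = {
    "File A": {"CBR(128k)": 470, "VBR(96k+)": 368,
               "VBR(128k+)": 513, "VBR(256k+)": 983},
    "File B": {"CBR(128k)": 470, "VBR(96k+)": 333,
               "VBR(128k+)": 485, "VBR(256k+)": 973},
}

def implied_kbps(vbr_size, cbr_size, cbr_kbps=128):
    """Average bitrate implied by a VBR file's size.

    Assumes the CBR and VBR files cover the same clip, so size
    scales directly with average bitrate.
    """
    return cbr_kbps * vbr_size / cbr_size

for name, row in SIZES_K.items():
    for setting, size in row.items():
        if setting.startswith("VBR"):
            avg = implied_kbps(size, row["CBR(128k)"])
            print(name, setting, round(avg), "kbps average")
```

By this estimate the 96k+ files average roughly 100kbps (File A) and 91kbps (File B), so the encoder spends most of its time near the base bitrate and climbs only when the music calls for it.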

Quality: We feel the 96k VBR files are roughly equivalent in sound quality to the 128k CBR files. This is because when the music needs it, VBR-encoded files bump up to 128k (or more). Most encoders tend to increase rather than drop bitrates, so the "base bitrates" we've shown indicate (especially in iTunes' case) the low end of the range used throughout the resulting file.

Other Observations and Notes: The file size chart demonstrates the effect of a wide dynamic range on VBR technology. Using standard CBR encoding, file sizes were (of course) identical for both File A and File B. However, in VBR, file size difference becomes noticeable with File A (complex classical music) requiring more bits to represent the same sonic experience than File B (standard rock).

VBR is a good idea and now makes us want the MP3 players to have their groovy visualization plug-ins change as the music changes. If the bitchin' graphics could react in some way as they detect bitrate changes in VBR-encoded files, it would be cool. We noticed that Winamp does have a handy feature on VBR playback - you can see the bitrate changing as the song plays.

A Caution: Some older MP3 players may not be able to play VBR MP3 files. However, they are somewhat rare and there is a wide variety of current free players that work fine. Make sure the links you provide to MP3 players on your web site work properly.

Even modern players that play VBR might still have problems displaying accurate track lengths and seeking to the right place in the file. It's a result of the method used to figure out how long a song is and doesn't affect playback quality of the song. With CBR files, players can perform simple division of file size by bitrate, but that becomes more complex with VBR. A small detail, and not likely to cause any real grief.
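Here is a Python sketch of why the track-length math goes wrong. The "simple division" is the CBR method described above. The frame-counting method relies on the fact that each MPEG-1 Layer III frame holds 1152 samples. The file in the example (512,000 bytes, 1225 frames, first frame at 64kbps) is hypothetical, as are the function names.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III

def cbr_seconds(file_bytes, kbps):
    """Track length by simple division: fine for CBR files."""
    return file_bytes * 8 / (kbps * 1000)

def vbr_seconds(frame_count, sample_rate=44100):
    """Track length by counting frames: works at any mix of bitrates."""
    return frame_count * SAMPLES_PER_FRAME / sample_rate

# Hypothetical VBR file: 512,000 bytes, 1225 frames at 44.1kHz,
# and a quiet intro whose first frame happens to be 64kbps.
naive = cbr_seconds(512000, 64)  # player trusts the first frame's bitrate
actual = vbr_seconds(1225)       # player counts every frame

print(naive)   # 64.0
print(actual)  # 32.0
```

A player that trusts the first frame's bitrate would show this 32-second clip as 64 seconds long, and would seek to the wrong spot for the same reason. Counting frames fixes it, at the cost of scanning the file first.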

Of course, quality is in the ear of the beholder, so we encourage you to do your own tests, or listen to our resulting clips for yourself by clicking on the sizes in the table above.


About the authors:

Jon Luini is a working technophile, a musician (bass player/singer) with full-blown facility and extensive experience on the Web and no free time. He is a co-founder of IUMA and MediaCast, co-creator of Addicted To Noise, and runs an Internet and music consulting and technology company, Chime Interactive (formerly Evolve Internet Solutions).

Allen Whitman is a working musician (bass player/singer/producer) with a keen, real-world interest in the practical use of the Web. Music credits include: The Mermen, "Brine-The Antisurf Soundtrack", biL, Deep Field South, Doormouse, Delectric and Drizzoletto. He has written for the San Francisco Examiner, Wired, EQ, Revolution, Yahoo Internet Life, Prosound News, Surround Professional, Replication News and others.

©1996-2003 The FezGuys™