An Introduction to Compressed Audio with Ogg Vorbis

copyrighted by Graham Mitchell (see the bottom for permission to reprint this article)
Last updated: Tuesday, 2003-06-25 at 13:21 CDT

This document is an introduction to compressed audio. It covers the basics of sound waves, how audio is stored digitally, why compressed audio is necessary, the basics of lossy audio compression, and why Ogg Vorbis is a good choice for compressed audio. It assumes a basic familiarity with physics and computer terminology, including the terms “bit”, “byte”, and “file”.

It also seems to be one of the most popular introductions to Ogg Vorbis on the web: it has been translated into half a dozen languages and has over 1,000 readers each month.
Table of Contents

The Sounds of Music – short introduction to wave theory and the wave properties of sound
Digital Audio – how analog sound waves are represented in a digital format
The Size Problem – why there is a desire to compress digital audio
Lossless Compression – on the sort of compression that loses no information
Lossy Compression – how imperfect compression can still sound the same
A Bit on Bitrates – definitions and discussion of “bitrates”
CBR? VBR? ABR? – different types of bitrate management schemes
Just Say No to Bitrates – why using bitrate as a measure of sound quality is fundamentally flawed
Why Ogg Vorbis? – why I think Ogg Vorbis is the ideal compressed audio format for many uses
Why Not Ogg Vorbis? – reasons why Ogg Vorbis might not be the ideal compressed audio format for you
What Quality Should I Use? – discussion about proper listening testing and suggestions for quality
A Note About “Transcoding” – why it’s a bad idea to convert from one compressed audio format to another (e.g. MP3 -> Vorbis)
Reprinting This Article – how you can legally translate or reprint this article without getting explicit permission
Version History – the changes this document has seen over time

The Sounds of Music

“The hills are alive with the sound of music….”
– Christian, Moulin Rouge

Music is made up of waves. When a violin player bows a string, the string vibrates at a certain frequency and creates a sound wave, which travels through the air, hits your eardrum and causes it to vibrate. Your brain interprets the signals coming from your eardrum and “hears” a sound.

Likewise, everything else you can hear is because something is vibrating and creating sound waves. In a trumpet, it’s a column of air. With an electric guitar, the vibrating strings send a signal through the amplifier, which causes a speaker cone to vibrate in the same manner as the original string. When you speak or sing, it’s your vocal cords vibrating. All of these things generate sound waves.
[Figure: Analog Wave]

The properties of these waves affect how they sound. The frequency of a wave refers to how many times per second the wave transitions from its highest point to its lowest point and back again. This is typically measured in hertz (Hz), or number of cycles per second. The frequency of a wave determines its pitch. High frequency waves have a high pitch, and low frequency waves have a low pitch. The average human can hear frequencies from 15 or 20 Hz to roughly 20,000 Hz (20 kHz).

The amplitude of a wave refers to half the distance between a wave’s highest point and its lowest. The larger the amplitude of a wave, the louder its volume, which is typically measured in decibels (dB). The decibel range for human hearing is complicated and depends on the frequency of the sound in question, but roughly ranges from 0 to 120 dB, with each increase of 10 dB corresponding to a doubling of the perceived volume. (Yes, I know that 3 dB doubles the energy, but that’s not the same thing.)
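
The two "doubling" rules can be checked with a little arithmetic. The following is only a sketch in Python; the function names are made up for illustration, and the loudness formula simply encodes the 10 dB rule of thumb rather than any precise model of hearing.

```python
def power_ratio(db_change):
    """Ratio of sound energy (power) for a given change in decibels."""
    return 10 ** (db_change / 10)

def perceived_loudness_ratio(db_change):
    """Rule of thumb: every additional 10 dB sounds about twice as loud."""
    return 2 ** (db_change / 10)

print(round(power_ratio(3), 2))                # 2.0: 3 dB doubles the energy
print(round(power_ratio(10), 2))               # 10.0: 10 dB is ten times the energy
print(round(perceived_loudness_ratio(10), 2))  # 2.0: but it only sounds twice as loud
```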
Digital Audio

As early as World War II, engineers were experimenting with digital audio, converting the analog waves of sound into discrete values. This was accomplished by “sampling” the sound wave many times per second, with each sample recording the amplitude of the wave at that point (including whether the wave was “up” or “down”). By the Nyquist Theorem, the sample rate (number of samples per second) must be at least twice as high as the highest recorded frequency to prevent aliasing, where frequencies too high for the sample rate show up in the recording as spurious lower tones.
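
To see why the sample rate matters, here is a small Python sketch (my own illustration, with made-up names) that samples two cosine waves at 44,100 samples per second. One is a 30 kHz tone, which is above half the sample rate; the other is a 14.1 kHz tone. Once sampled, the two are indistinguishable.

```python
import math

SAMPLE_RATE = 44_100  # samples per second

def sample_tone(freq_hz, count):
    """Sample a cosine wave of the given frequency `count` times."""
    return [math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE) for n in range(count)]

too_high = sample_tone(30_000, 50)  # above half the sample rate (22,050 Hz)
alias = sample_tone(14_100, 50)     # 44,100 - 30,000 = 14,100 Hz

# The two lists of samples match, so after sampling the 30 kHz tone
# cannot be told apart from an audible 14.1 kHz tone.
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(too_high, alias))
print(same)  # True
```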

So, in the 1970s, when Philips and Sony began looking for a way to improve audio quality for recorded music, they turned to digital sampling. A sample rate of 44,100 samples per second (44.1 kHz) was chosen because it exceeded the target sample rate of 40 kHz (twice the highest frequency humans can hear, 20 kHz) and because that’s how much information could be stored on a video tape, the storage medium of choice until the little silver plastic discs we know as CDs were perfected.
[Figure: Digital Wave]

Each “sample” is a 16-bit number, ranging from -32,768 to 32,767. This number indicates the amplitude of the wave at the instant of sampling. Thus a sampled wave oscillating back and forth from -32,768 to 32,767 would be the loudest wave this format could represent, a wave changing from -1 to 1 would be the quietest, and a bunch of zeroes in a row would indicate complete silence. This range of values for the amplitude is fairly fine-grained, which allows even subtle volume differences to be accurately represented. Sampling audio in this digital fashion is known as Pulse Code Modulation (PCM), and is the most popular method of digital sampling.
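
As a rough illustration of what sampling produces, the Python sketch below generates 16-bit samples for a pure tone. The function name and the 440 Hz test tone are my own choices for the example; a real recording would of course sample a microphone signal rather than a formula.

```python
import math

SAMPLE_RATE = 44_100    # samples per second
MAX_AMPLITUDE = 32_767  # largest positive 16-bit sample value

def pcm_samples(freq_hz, seconds, volume=1.0):
    """Sample a sine wave and return a list of 16-bit integer samples."""
    count = round(SAMPLE_RATE * seconds)
    return [
        round(volume * MAX_AMPLITUDE * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
        for n in range(count)
    ]

tone = pcm_samples(440, 0.01)  # 10 milliseconds of a 440 Hz tone at full volume
print(len(tone))               # 441 samples
print(min(tone), max(tone))    # stays within -32,768 to 32,767
```

Passing a smaller `volume` (say 0.5) gives the same wave with half the amplitude, which is all "quieter" means in this representation.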

PCM digital audio produces quite an accurate picture of the “live” sound, and only the keenest listeners with good equipment can distinguish between it and the original.
The Size Problem

It is possible (and fairly easy) to “rip” the audio data from a CD and store it in “WAV” files on a computer, and these files can be played back on demand. So ideally, you’d want to hear your music at this quality everywhere, since it’s the highest quality you can typically purchase. You’d want copies of your music, at this quality, in your car, on your computer, in your portable music player, and in your stereo. Why is this not currently feasible? The answer is size.

A little math can reveal the space required to store sound information at this quality. Each sample is 16 bits, or two bytes. There are 44,100 samples each second, and since modern music is recorded in stereo, there is both a left and a right channel. This results in ( 2 * 44100 * 2 ) = 176,400 bytes to store one second’s worth of samples. This means 10,584,000 bytes or approximately 10 megabytes to store just one minute of CD-quality audio. This may not sound too alarming, given many have hard drives holding tens and even hundreds of gigabytes, but it adds up quickly.
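
For those who would rather let the computer do the multiplication, the same calculation in Python looks like this (the constant names are just for the example):

```python
BYTES_PER_SAMPLE = 2   # 16 bits
SAMPLE_RATE = 44_100   # samples per second, per channel
CHANNELS = 2           # stereo: left and right

bytes_per_second = BYTES_PER_SAMPLE * SAMPLE_RATE * CHANNELS
bytes_per_minute = bytes_per_second * 60

print(bytes_per_second)                    # 176400
print(bytes_per_minute)                    # 10584000
print(round(bytes_per_minute / 2**20, 1))  # 10.1, i.e. approximately 10 megabytes
```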

My personal music collection currently consists of 1307 songs on 102 different albums (a couple of these are double albums). The total playing time of all 1307 songs combined is 5,423 minutes and 23 seconds (over three days and eighteen hours) and so would require an estimated 53 gigabytes of hard drive space to store in perfect CD-quality!!! With a little cash, a personal computer could have that much storage for now, but most portable music players have less than 1% of this much space.

As video DVDs with “surround sound” audio become more popular, this will only become more of a problem: such audio typically contains 5 channels (left, right, left rear, right rear, and center), nearly tripling the space requirements! And for DVD Audio discs it’s even worse: up to six channels with 24-bit samples at 96 kHz, requiring almost ten times the space!
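
These ratios are easy to verify. The Python sketch below uses a helper function of my own invention, and it assumes the five-channel surround audio uses the same 16-bit, 44.1 kHz samples as a CD (which is what "nearly tripling" implies).

```python
def pcm_bytes_per_second(channels, bits_per_sample, sample_rate):
    """Bytes needed for one second of uncompressed PCM audio."""
    return channels * (bits_per_sample // 8) * sample_rate

cd = pcm_bytes_per_second(2, 16, 44_100)
surround = pcm_bytes_per_second(5, 16, 44_100)   # assumed CD-style samples
dvd_audio = pcm_bytes_per_second(6, 24, 96_000)

print(surround / cd)             # 2.5 times the space of a CD
print(round(dvd_audio / cd, 1))  # 9.8 times the space of a CD

# The music collection mentioned above: 5,423 minutes and 23 seconds.
collection_seconds = 5_423 * 60 + 23
print(round(collection_seconds * cd / 2**30, 1))  # 53.5 gigabytes
```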

Clearly, for the near future (at least until portable music players have hundreds of gigs of storage), you won’t be able to carry around your entire music collection.

Or can you?

Fortunately, there is a solution. Compression is the technique of making a file take up less space while still containing the same information. There are two categories of compression: lossless and lossy.
Lossless Compression

Lossless compression means that the compressed, smaller file can be expanded back into the original file without losing any information whatsoever. That is: take a file, compress it, and uncompress it again. If the result is bit-for-bit identical to the original file, 100% of the time, for any given input file, then the compression scheme is lossless. No information is lost.
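
The round trip is simple to try. This Python sketch uses the standard zlib module (the same family of compression as gzip) on some stand-in data; the data is deliberately repetitive, so it compresses far better than real audio would.

```python
import zlib

original = bytes(range(256)) * 64      # 16,384 bytes of repetitive stand-in data

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), len(compressed))  # the compressed copy is smaller
print(restored == original)            # True: nothing was lost
```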

Unfortunately, compressing audio losslessly is hard. General-purpose compression programs like WinZip and gzip only manage about a 5% reduction in filesize on average. Even “next-generation” utilities like WinRAR and bzip2 only manage a few percent more.

There are special-purpose compressors (like flac) which were designed solely for losslessly compressing audio, but even they only manage about a 50% reduction in filesize on average. While this is enough for some, for music files to be truly portable they must be even smaller.
Lossy Compression

Lossy compression is any compression which causes information to be lost. Compressing and then uncompressing a file results in something similar, but not identical, to the original file. This is no good for things which must be interpreted by a computer, like executable programs/applications or most computer-readable data, but is often just fine for things where the interpretation is being done by a human (like photographs or sounds). The trick is to remove little bits of information in places where it can’t be perceived.
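
Here is lossy compression in its crudest possible form, as a Python sketch: throw away the low 8 bits of every 16-bit sample. To be clear, this is not what Ogg Vorbis or mp3 does (and it would sound noticeably worse); it only shows what "similar, but not identical" means. The function names are invented for the example.

```python
def to_8_bit(sample):
    """Keep only the top 8 bits of a 16-bit sample (halves the storage)."""
    return sample >> 8

def back_to_16_bit(small):
    """Expand an 8-bit value back into the 16-bit range."""
    return small << 8

original = [0, 1000, -1000, 12345, -32768, 32767]
restored = [back_to_16_bit(to_8_bit(s)) for s in original]

print(restored)  # close to the original values, but not the same
print(max(abs(a - b) for a, b in zip(original, restored)))  # the error is always under 256
```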

Lossy audio compression works using a psychoacoustic model. That is, by modeling how your ears (and your brain) hear sound, it is possible to find places to remove information that you wouldn’t have perceived anyway. A full treatment of these techniques is beyond the scope of this document, but here are two simple examples:

Though humans can technically hear tones up to 20 kHz in pitch, most can’t hear anything above 15 kHz, especially when other sounds are present. However, most CD-quality audio contains information for reproducing these tones anyway. By filtering out tones outside this range, you reduce the amount of information that has to be stored without affecting the perceived sound quality. (And even humans that can hear such tones wouldn’t have heard them anyway on cheap computer speakers unable to produce such frequencies.)

Similarly, if a piece of music contains a loud bass drum hit (as most rock and roll does, a couple of times each second), the eardrum is too busy reacting to the percussive hits of the drum to register any other sounds at all for a few milliseconds. By simply omitting the samples immediately after such sounds, less information can be stored while still maintaining the same perceived sound quality. (Note: this is merely an example. I am not aware of any encoder which actually does this.)

Using sophisticated techniques such as these, lossy audio compression formats such as Ogg Vorbis and mp3 can achieve results which are provably indistinguishable from the original, CD-quality sound but are a mere 10 to 20% of the size.

And what’s even better, being more aggressive with these techniques can result in files which are less than 5% of the original size but still sound quite good on normal equipment (think FM radio quality).
A Bit on Bitrates

The ultimate size of such files is driven by their bitrate, that is, how many bits the compressor (a.k.a. encoder) uses to represent each second of audio. Actual, uncompressed, CD-quality audio uses 176,400 bytes or 1,411,200 bits to store each second. This is roughly 1411 kilobits per second, or 1411 kbps. Typical lossy formats would only use anywhere from 64 to 256 kbps to store the “same” information.
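
The conversion from bytes to a bitrate, and how small the typical lossy bitrates are by comparison, can be worked out in a few lines of Python (a sketch; the variable names are mine):

```python
CD_BYTES_PER_SECOND = 176_400
cd_bits_per_second = CD_BYTES_PER_SECOND * 8
print(cd_bits_per_second)        # 1411200, i.e. roughly 1411 kbps

for kbps in (64, 128, 256):
    share = kbps * 1000 / cd_bits_per_second
    print(kbps, f"{share:.1%}")  # share of the uncompressed bitrate
```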

The problem is that bitrates only speak to the size of the file, not its quality. For example, one could write a compression format that achieves a 256 kbps bitrate by taking only the first 256,000 out of every 1,411,200 bits (18%) in any given second. Although some foolish people might assume a song encoded in this format would sound better than a typical 128 kbps mp3, any listening test would be able to easily prove the inferiority of such a technique.

The mp3 format, developed by Fraunhofer and Thomson, is a heavily-patented format and was ground-breaking in its time. Because it was the first widely-adopted lossy audio compression codec, people associate certain bitrates with certain levels of quality.

However, even within the aging mp3 format, and even within a single bitrate (say, 128 kbps), the sound quality of various encoders varies drastically. The Xing encoder is fast but produces poor-sounding files even at 128 kbps. The Lame encoder is a bit slower but produces markedly better-sounding files at the same bitrate.

Newer lossy audio compression codecs like WMA and Ogg Vorbis use different psychoacoustic models and noticeably improve sound quality at a given bitrate even over the best mp3 encoder.
CBR? VBR? ABR?

To further muddy the waters, not even bitrate alone will tell the whole story. Early mp3 encoders (and most to this day) used what is called an “average bit rate” (ABR). If, for example, we encode a file at 128 kbps, then the encoder will use 128 kilobits to encode each second of the song, no matter what. So the first measure (consisting of perhaps two drum clicks) will use 128 kilobits and will represent that second nearly exactly. On the other hand, halfway through the song, where the lead guitar is ripping into a solo, the drummer is going crazy on the cymbals and the bass guitar is playing a funky groove, the encoder will still have to use 128 kilobits to encode that second, where it could have used, say 300. Thus that second will be represented rather poorly.

Newer mp3 encoders (like Lame) support what is called “variable bit rate” mode (VBR). This gives the encoder the freedom to save bits on simple sections that don’t need as many bits to represent them well and thus have some “extra” bits left over to use for sections that really need them. This usually results in files which are slightly smaller than ABR files even at the same target bitrate, but which sound much better in the busy sections.
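
The difference between the two approaches can be pictured with a toy model in Python. Everything here is hypothetical: the "busyness" scores are invented, and real encoders decide how many bits a passage needs in far more sophisticated ways. For simplicity this sketch also keeps the total number of bits the same in both cases.

```python
def fixed_allocation(complexities, kbits_per_second):
    """Give every second of audio the same number of kilobits."""
    return [kbits_per_second for _ in complexities]

def variable_allocation(complexities, average_kbits):
    """Share the same total budget in proportion to how busy each second is."""
    total_budget = average_kbits * len(complexities)
    total_complexity = sum(complexities)
    return [total_budget * c / total_complexity for c in complexities]

# Invented "busyness" scores for four seconds of a song:
# two drum clicks, a verse, the guitar solo, and a quiet ending.
complexities = [1, 4, 10, 1]

print(fixed_allocation(complexities, 128))     # [128, 128, 128, 128]
print(variable_allocation(complexities, 128))  # [32.0, 128.0, 320.0, 32.0]
```

Both lists add up to 512 kilobits, but the second one spends them where the music needs them.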

Occasionally you may see mp3s referred to as CBR, for “constant bit rate”. This would mean that every sample of the encoded file must use exactly the same number of bits. In reality, mp3 uses a bit reservoir to average bit rates over a small period of time, so technically ABR is being used. It is unlikely that any compressed audio formats use true CBR.

Unfortunately, though some newer mp3 encoders do support VBR, some portable hardware mp3 players can’t play such mp3s. And even when their encoders and players support this mode, many people don’t use it for whatever reason. (Habit? Ignorance?)

Almost every lossy audio compression codec newer than mp3 supports VBR, though some don’t turn it on by default.
Just Say No to Bitrates

Testing by Fraunhofer and Thomson found that for mp3s, 256 kbps was true “CD-quality”; that is, their sound engineers could rarely tell the difference between mp3s encoded at that rate and the original CDs. These files were roughly 20% of the original filesize, but virtually indistinguishable in quality.

Since then, 128 kbps mp3s have become the “standard”. Although most people with good equipment can hear the difference, they still sound good enough for the average listener. These files are roughly 10% of the original filesize, and are the source of the 1 megabyte per minute rule some stores quote for determining how much music you can fit on a given portable mp3 player.

People with dull ears, bad equipment, or a strong desire to fit twice as much music on a portable player encode mp3s at 64 kbps. These files sound not much worse than FM radio, and are a mere 5% of the original filesize. (Most, but not all, portable players that advertise “stores over 2 hours of music” are assuming the use of 64 kbps mp3s, rather than the more common 128 kbps mp3s.)
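
To see where the "1 megabyte per minute" rule and the "over 2 hours" claims come from, here is a Python sketch. The 64 MB player is a hypothetical example, not a specific product, and the function name is my own.

```python
def minutes_that_fit(capacity_megabytes, kbps):
    """Minutes of audio at `kbps` that fit in a given capacity (1 MB = 10**6 bytes)."""
    bytes_per_minute = kbps * 1000 / 8 * 60
    return capacity_megabytes * 1_000_000 / bytes_per_minute

print(128 * 1000 / 8 * 60)               # 960000.0 bytes: about 1 megabyte per minute
print(round(minutes_that_fit(64, 128)))  # 67 minutes on a hypothetical 64 MB player
print(round(minutes_that_fit(64, 64)))   # 133 minutes, just over two hours
```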

The problem with these rules of thumb is that they only work for mp3s. There are now many formats newer than mp3 which all improve on sound quality. And, as more sophisticated psychoacoustic models are developed, the difference in sound quality between one bitrate for mp3 and the same bitrate for a different format will only continue to widen.

For example, a file encoded in the Ogg Vorbis format at “quality 3” typically results in an average bitrate of 112 kbps but sounds better than an mp3 at 128 kbps and often as good as an mp3 encoded at 160 kbps.

For this reason, the Ogg Vorbis community discourages users from trying to achieve a specific bitrate when encoding and encourages them to concentrate on sound quality instead. In fact, Ogg Vorbis encoders don’t normally consider bitrate at all (the default mode of operation is VBR), instead using a “quality” rating, which ranges from -1 to 10 in increments of 0.01 or so. This quality rating is a measure of how close to the original the compressed file should sound; the encoder uses as many or as few bits as necessary to satisfy the quality requirement. Each quality setting results in a rough average bitrate for a piece of average music, but this is a by-product of how the encoder has been tuned; the encoder does not aim at any particular bitrate.

The default quality setting is 3, which should be fine for the average user since it gives sound quality better than a 128 kbps mp3 but is over 10% smaller. Someone wanting sound almost identical to a 128 kbps mp3 can usually get by with quality 2, which sounds as good but is 25% smaller.
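
The size savings follow directly from the average bitrates. In this Python sketch, the 112 kbps figure for quality 3 comes from the text above; the 96 kbps figure for quality 2 is simply the average that "25% smaller" implies, so treat it as an assumption rather than a promise from the encoder.

```python
def percent_smaller(kbps, reference_kbps=128):
    """How much smaller a file at `kbps` is than one at `reference_kbps`."""
    return (1 - kbps / reference_kbps) * 100

print(percent_smaller(112))  # 12.5: quality 3 at about 112 kbps
print(percent_smaller(96))   # 25.0: the average bitrate a "25% smaller" file implies
```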

The rest of this document assumes that you will be encoding music using the Ogg Vorbis format.
Why Ogg Vorbis?

Ogg Vorbis is a good choice because the sound quality is among the best of the newest formats out there. Recent double-blind listening tests put Ogg Vorbis among the highest quality of all the “second-generation” compressed audio codecs. This means you either save space and get the same quality, get higher quality for the same space requirements, or a combination of both (i.e. a little smaller and a little better-sounding).

Secondly, Ogg Vorbis is not only Open Source (BSD license), but is completely patent-free. This means that hardware manufacturers wanting to support Ogg Vorbis in their portable music players can do so without paying license fees, unlike most other formats. Software developers can use the Ogg Vorbis format for music/sounds in their games without having to get permission from some powerful company and without paying royalties. And the open nature of the code for the format means that many people have the freedom to port the tools to many other systems or add features, fix bugs and improve the code if they so desire. In fact, the BSD license allows for developers to modify their code to suit their own needs, and they don’t even have to publish their changes! Most other formats are heavily patented and tightly controlled.

Finally, the format is well-designed to have several features some of the others don’t. Those familiar with id3 tags for mp3 files will be well aware of their limitations; Ogg Vorbis features a flexible tagging standard which allows complete customization of tags for a given file, including user-defined tags (like “remixed by” or whatever you like).

Ogg Vorbis files support “bitrate peeling”, which means you can produce a lower bitrate file from a higher bitrate file without re-encoding and at the same quality as if you’d encoded the file directly into the lower bitrate from the original file. No other lossy audio codec currently supports this. (N.B. Current files are peelable, but not very well. Good peeling support requires the encoder to be redesigned to store data in a more peeler-friendly (but still backwards-compatible) format. This is being worked on, slowly, but isn’t currently a high priority.)

And Ogg Vorbis files are not limited to merely two channels of audio (left and right). They support up to 255 distinct channels, and thus are a natural fit for encoding the 6 channels of DVD audio alongside your DivX ;-) video.

Just to be clear, strictly speaking, the name “Ogg” refers to a generic container format which could hold many types of multimedia files (lossy compressed audio (Ogg Vorbis), lossy compressed audio designed for speech (Ogg Speex), lossless compressed audio (Ogg Flac), lossy compressed video (Ogg Theora), etc). “Vorbis” is the lossy compressed audio codec which is typically transported in Ogg files. “Ogg Vorbis” refers to both parts together: an Ogg format file containing audio compressed using Vorbis. And throughout this document, I’ve used the term “ogg” to generically mean “a file containing compressed audio in Vorbis format” since that’s the file’s extension, just like I’ve used “mp3” to mean “a file containing compressed audio in MPEG layer 3 format”.
Why Not Ogg Vorbis?

There are a few domains where use of Ogg Vorbis probably isn’t appropriate right now, and they mostly stem from ubiquity. Support for mp3s is widespread, since it was the first popular compressed audio format and has been around the longest. This means that you’ll have a hard time finding a portable music player that will play Ogg Vorbis files. Several are coming to market in the next few months but for now support is virtually non-existent.

Similarly, if you’re planning on distributing music in Vorbis format, you may have to help your recipients get something set up to play it. Anecdotal evidence indicates that most Windows users play audio using Windows Media Player, which doesn’t support Ogg Vorbis files by default (though I’m told it’s possible to get support working, at least with older versions of WMP). Likewise, most Mac users play audio with iTunes, which doesn’t come with Vorbis support installed.

Personally, I don’t distribute music files, and can play them fine at home, but your situation may vary, so at least be aware of the problem. I expect that iTunes will eventually support Ogg Vorbis out of the box. Getting Microsoft to support such an open standard seems unlikely in the foreseeable future, however.

Note that widespread acceptance (especially in hardware) is difficult and slow for any new format.
What Quality Should I Use?

To find out what quality level you should use to encode, you’ll need to do some listening tests. First, get one of your CDs and use a CD ripper (like EAC) to rip a track or two into a WAV file on the hard drive. This will take up some space, as mentioned before. If you’re comfortable with command-line tools, use oggenc to encode as specified below. If not, use a GUI tool like OggDrop.

Encode the test tracks using the default settings.

oggenc Track01.wav

This will be VBR, quality 3. Listen to the encoded tracks and decide how they sound to you. Do they sound good? Any complaints? If you think they sound fine, then encode all your music using the default settings and don’t even worry about bitrate, quality or anything. Your ears are the only metric that really matters.

If the files don’t sound good to you, or if they do and you’re just curious how much better they could sound, then try encoding them again at quality 4. Adjust the slider in OggDrop or issue the following command at the command line:

oggenc -q 4 Track01.wav

Then listen to these versions. Can you tell a difference in sound quality? If not, then there’s no reason not to encode at the default settings. If you can’t hear the difference, you’re only wasting space by encoding at a higher quality.

If you can hear the difference, is the increase in quality enough to justify the increase in filesize? If so, then perhaps keep increasing the quality by 0.5 or so until you can’t hear the improvement from the previous setting. Technically, the sound quality does continue to improve all the way up to quality 10, but almost no one can hear any differences after quality 7.
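
If you want to compare several quality levels in one sitting, a small Python script can generate the commands for you. This is only a sketch: it assumes your oggenc accepts the -o option for naming the output file (check oggenc --help), and the output naming scheme is my own invention.

```python
import shlex

def oggenc_command(wav_path, quality):
    """Build an oggenc command line for one quality setting."""
    out_path = f"{wav_path.rsplit('.', 1)[0]}-q{quality}.ogg"
    return ["oggenc", "-q", str(quality), "-o", out_path, wav_path]

# Print one command per quality step, from the default (3) up to 7.
quality = 3.0
while quality <= 7.0:
    print(shlex.join(oggenc_command("Track01.wav", quality)))
    quality += 0.5
```

The script only prints the commands. To actually run them, pass each list to subprocess.run(..., check=True) instead of printing it.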

If you’re serious about sound quality, you might want to use ABX for your listening tests. ABX is a testing methodology that allows you to determine conclusively and repeatably if you are really hearing differences in two sound files. The PC ABX home page is a good place to get started.
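
The reason ABX results are convincing is statistical: if you can’t really hear a difference, each trial is a coin flip, and the odds of scoring well by luck shrink quickly. The Python sketch below computes those odds; the function name and the 16-trial session are just for illustration.

```python
from math import comb

def chance_of_guessing(correct, trials):
    """Probability of getting at least `correct` of `trials` right by coin-flipping."""
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials

print(round(chance_of_guessing(12, 16), 3))  # 0.038: unlikely to be luck
print(round(chance_of_guessing(9, 16), 3))   # 0.402: could easily be luck
```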

Also, be advised that normal Ogg Vorbis files use lossy channel coupling, meaning that redundancies between the left and right channels are combined to save space. This does keep the files smaller, but also means that technically the stereo image of an Ogg Vorbis file might not always be identical to the original stereo image. If this concerns you, you’ll want to encode at quality 6 or higher, which is where the lossy channel coupling is turned off and all channel coupling is lossless. Most can’t tell the difference, but maybe you can.

On the other hand, if file size is more important to you than sound quality, try lowering the quality until the decrease in quality becomes significant enough for you. Some Ogg Vorbis users (with happily dull ears) report that they can’t tell the difference between the CD and an Ogg at quality 0! These people are saving quite a bit of space and still listening to music with a sound quality acceptable to them.

For those who want to listen to their music at home (where hard drive space is plentiful) and on a portable device where space is at a premium, encode at the higher quality. Ogg Vorbis files can always be “peeled” later to get the lower quality versions from the higher quality ones.

A few people are using Oggs for streaming over the web. For these endeavors, variable bit rate is not acceptable because though the bitrate averages properly, bitrate spikes can exceed the available bandwidth. A CBR mode and even maximum and minimum bitrate settings exist in Ogg Vorbis, but no details are given here because the techniques used to produce such files always result in worse-sounding files than a file of the same size using the default settings. The guarantees on bandwidth come at a price.

For the average user, if you are encoding Ogg Vorbis files using any encoding settings other than “-q n”, you are getting lower quality files than you could be at a given size.
A Note About “Transcoding”

Some people have a lot of music in mp3 format, but do not have the original CDs (*cough*). Others have the CDs, but spent months ripping and encoding all of them into mp3 format and don’t want to go through the trouble again. Such people are often tempted to take their mp3s, decompress them into WAVs, and re-encode them into Ogg Vorbis files. Some have even gone so far as to create tools to automate this process.

If you care about sound quality, you should never, ever do this. Ogg Vorbis uses similar but different techniques to remove information, and by transcoding, you lose information twice. Similar to faxing a photocopy of a fax, the “transcoded” ogg will always sound worse than even the original mp3.
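
The "losing information twice" effect shows up even in a toy model. In the Python sketch below, two made-up "formats" each round samples to a different step size. This is nothing like what mp3 or Vorbis really does; it only illustrates that chaining two different lossy steps leaves you further from the original than applying the second one directly.

```python
def quantize(samples, step):
    """Lossy step: round every sample to the nearest multiple of `step`."""
    return [round(s / step) * step for s in samples]

def worst_error(original, processed):
    """Largest difference between an original sample and its processed version."""
    return max(abs(a - b) for a, b in zip(original, processed))

original = list(range(1000))

direct = quantize(original, 5)                # original -> format B
chained = quantize(quantize(original, 7), 5)  # original -> format A -> format B

print(worst_error(original, direct))   # 2
print(worst_error(original, chained))  # larger than the direct error
```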

Besides, for most users this isn’t necessary, as almost every player that supports oggs supports mp3s as well, and your mp3 collection can peacefully co-exist with your growing ogg collection. In fact, the only compelling reason to get rid of existing mp3s is ethical, not technical: either because your mp3s were illegal copies or because you don’t want to support a patented format.

However, if your mp3s are of exceedingly high quality (say, 256 kbps or higher), then you probably haven’t incurred enough quality loss to worry about. They could probably be converted to Ogg Vorbis (especially at lower bitrates) with little additional loss of quality.
Reprinting This Article

Please visit http://grahammitchell.com for inquiries.

This is from his original post:
Though I retain the copyright on this article, I grant permission to everyone to reprint this article in any form. You may translate it to another language, include it in a handout for a class, print it in a book, etc. However, you must credit me (Graham Mitchell) as the author, and you must include a link/reference to this original. You may do any of these things without otherwise contacting me. However, if you are going to do something like this, I’d appreciate it if you’d let me know. I like to keep tabs on how people are using this introduction.

If you’d like to do anything else with this article (substantial cuts for a shorter publication, make a derivative work, etc.), you must contact me for permission. Chances are extremely high that I’ll grant permission as well, but I’d like to decide on a case-by-case basis.
Version History

2003-06-25 – version 15
* added table of contents and tags so interested
readers can jump directly to a specific section
* fixed capitalization of section headings in places
* added the word “Ogg” to the title of this document
* added a link to Theora’s site (http://www.theora.org/)
* referenced and added a link to Speex’s site (http://www.speex.org/)

2003-06-03 – version 14
* reworded first paragraph under “Lossy Compression” for a clearer
explanation
* reworded third paragraph in “Just Say No to Bitrates” and added
information about portable manufacturers and their claims about how
many “minutes” of audio their products can store
* put quotation marks around “quality 3” the first time it appears
(fifth paragraph of “Just Say No to Bitrates”) since the phrase
hasn’t been defined yet
* reworded first sentence in sixth paragraph of “Just Say No to
Bitrates”
* added a missing right parenthesis in fourth paragraph of “Why Ogg
Vorbis”
* added paragraph in section on transcoding admitting that “high bitrate
mp3 -> lower quality Ogg Vorbis” really isn’t that bad

2003-04-18 – version 13
* added section on “Why Not Ogg Vorbis”
* more detail on how “peelable” vorbis files are
* fixed several things suggested by Moritz Grimm (tag names not
case-sensitive, 255 channels not 256, reference Theora instead of
Tarkin)
* clarified 10 dB vs. 3 dB for “doubling”

2002-09-22 – version 12
* added information on how article may be reprinted
* lossless channel coupling is now at q 6, not q 5, and there’s no
file-size spike anymore
* clarified that Philips and Sony didn’t invent digital audio
* added DOCTYPE, stylesheet

2002-07-29 – version 11
* corrected a spelling error (Thompson -> Thomson)
* Fixed Ogg quality range (now goes down to -1)

2002-07-22 – version 10
* clarified that 5.1 audio has five channels, not six (bass is computed)
* added clarification that one of my example lossy compression
techniques isn’t actually used in practice.

2002-07-11 – version 9
* reworded “peeling” clarification, now that 1.0 is gold.

2002-07-09 – version 8
* clarified that though oggs are peelable, no tools to do so yet exist

2002-02-11 – version 7
* reworded first sentence of “What Quality Should I Use” to avoid
preposition woes

2002-02-10 – version 6
* fixed two widdle typos (thanks, Dad!)
* %s/Kbps/kbps/g
* clarified that BSD license allows derivatives to be closed-source
* added paragraph on “Ogg” vs “Vorbis” vs “Ogg Vorbis” vs “oggs”
* added mention of PCM to “Digital Audio”, stealing from an email by Nemo
* reworded and moved a sentence from “Digital Audio” to “The Size Problem”
* mentioned that some DVD audio is sampled at 24-bit, 96 kHz

2002-02-09 – version 5
* completely rewrote first section, adding diagrams of analog and digital
waves, breaking “Why” into “The Sound of Music” and “Digital Audio”
and vastly improving it, in my opinion
* modified end of “Just Say No to Bitrates” to clarify that Vorbis doesn’t
target any specific bitrate when quality settings are used. Includes
large verbatim quote from an email by Carsten Haese
* capitalized section headers throughout
* added final paragraph to “Transcoding” section
* colloquial improvements: removed “folks” and first-person references
* reworded a few sentences ending in prepositions
* simplified sentence mentioning flac
* put list of topics into introduction

2002-02-08 – version 4
* changed “nearly 10 megabytes” to “approximately 10 megabytes”
* wrong word in 2nd paragraph of “ABR/CBR/VBR” (“give” -> “gives”)

2002-02-07 – version 3
* fixed several grammatical and other errors noticed by Philip M. White
* corrected section on -32768 to 32767

2002-02-07 – second draft
* fixed range of 16bit samples (-32768 to 32767 rather than 0 to 65535)
* added sentence on why 44.1 kHz was chosen for CDs
* added Sony’s name as developer of CD format
* wrong word (“Which” -> “While”)
* reworked CBR/ABR/VBR section with corrected info
* added section on “Why Vorbis”
  * sound quality is high
  * patent-free and open source
  * nice features (tagging, peelable, more than two channels)
* added paragraph on ABX
* mentioned that lossless stereo coupling kicks in at -q 5
* put “transcoding” in quotes since it’s a misnomer

2002-02-06 – first draft

For more tech info, check out: http://grahammitchell.net