MP3 Sucks

2003-07-20T19:59:00Z

I mostly created this account as a forum for me to complain to the world so that everyone could hear me. So here is my first complaint. I hate the MP3 ‘format’.

The format is just a bunch of compressed audio frames. That’s it. There is no header to tell you useful stuff like how long the audio is. If you know the size of the file in bytes, and the compression rate (the bitrate) is constant throughout the file, then you can calculate the length of the decoded audio from the bitrate given in the first frame. There is perhaps a small problem with an ID3v1 tag at the end of the file that could throw off your calculations. But perhaps that isn’t such a big deal.
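As a rough sketch of that calculation in Python: the function and parameter names are made up for illustration, and it assumes a constant bitrate, a tag of known size at the start of the file (if any), and the fixed 128-byte ID3v1 tag at the end (if present).

```python
def cbr_duration_seconds(file_size, bitrate_kbps, leading_tag_bytes=0, has_id3v1=False):
    """Estimate the playing time of a constant-bitrate MP3, in seconds.

    Hypothetical helper. bitrate_kbps is the value read from the first
    frame header, in units of 1000 bits per second.
    """
    ID3V1_SIZE = 128  # an ID3v1 tag is always 128 bytes at the end of the file
    audio_bytes = file_size - leading_tag_bytes
    if has_id3v1:
        audio_bytes -= ID3V1_SIZE
    # bytes -> bits, divided by bits per second
    return audio_bytes * 8 / (bitrate_kbps * 1000)


print(cbr_duration_seconds(4_000_128, 128, has_id3v1=True))  # 250.0
```

Forgetting the ID3v1 tag only adds 128 bytes of error, which at 128 kbps is 8 milliseconds, so that part really isn’t a big deal.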

However, not all files have a constant compression rate throughout the file. There exist VBR files. The Xing people, who seem to have started this VBR thing, were kind enough to place useful information in the first audio frame of the file. The information includes the number of audio frames in the file, and from this you can calculate the total decoded length of the file. Unfortunately this Xing header doesn’t seem particularly well documented. Most documentation that one finds on the web will tell you that the position of the Xing header within the first audio frame varies depending on the audio parameters of the file. For the most common audio parameters it occurs 36 bytes after the start of the frame. Now from what I have been able to gather, MP3 audio frames can contain ancillary data within them. The Xing header isn’t at some random place. Rather, it is at the beginning of the ancillary data. For a silent frame the ancillary data begins right after the side information block. The length of the side information block depends on the audio parameters. So this is why the Xing header moves around depending on the audio parameters.
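The rule can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names are hypothetical, the side information sizes and samples-per-frame figures are the usual Layer III values, and the offset deliberately ignores any CRC bytes, just as the commonly quoted offsets do.

```python
# Side information length in bytes, keyed by (is_mpeg1, is_mono).
# Assumed from the usual Layer III layout; not taken from any Xing document.
SIDE_INFO_BYTES = {
    (True, False): 32,
    (True, True): 17,
    (False, False): 17,
    (False, True): 9,
}


def xing_offset(header: int) -> int:
    """Offset of the Xing header from the start of the frame, ignoring CRC."""
    version_id = (header >> 19) & 0x3   # 3 means MPEG-1; 2 and 0 mean MPEG-2 and MPEG-2.5
    channel_mode = (header >> 6) & 0x3  # 3 means mono; anything else has two channels
    side_info = SIDE_INFO_BYTES[(version_id == 3, channel_mode == 3)]
    return 4 + side_info                # 4 bytes of frame header, then side information


def vbr_duration_seconds(frame_count: int, sample_rate: int, is_mpeg1: bool = True) -> float:
    """Decoded length computed from the frame count stored in the Xing header."""
    samples_per_frame = 1152 if is_mpeg1 else 576
    return frame_count * samples_per_frame / sample_rate


print(xing_offset(0xFFFB304C))             # 36 (MPEG-1, joint stereo)
print(vbr_duration_seconds(10000, 48000))  # 240.0
```

For MPEG-1 stereo this gives 4 + 32 = 36, which is where the commonly quoted figure of 36 bytes comes from.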

If the documentation had said this, the world would be a better place. But it didn’t, so let’s look at the consequences. I’ve been trying to parse Xing headers so I can calculate the length of VBR files. Here is the beginning of the first frame of the second test file that I used.

0000110: 0000 0000 00ff fb30 4c00 0000 0000 0000  .......0L.......
0000120: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000130: 0000 0000 0000 0000 0058 696e 6700 0000  .........Xing...
0000140: 0f00 0016 6900 37b3 d200 0103 0508 0a0c  ....i.7.........

The frame begins at 0x115 (before that is the ID3v2 tag). The MP3 frame header is 4 bytes long, containing 0xfffb304c. Then we have 32 bytes of 0x00, and at 36 bytes after the beginning of the frame the Xing header appears. This is just as we expect. Now look at the beginning of the first frame of the first test file I tried using:

00010f0: 0000 0000 fffa 9064 861f 0000 0000 0000  .......d........
0001100: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0001110: 0000 0000 0000 0000 5869 6e67 0000 000f  ........Xing....
0001120: 0000 1bbd 003c f125 0003 0608 0a0e 1012  .....<.%........

The frame begins at 0x10f4. The MP3 frame header is 4 bytes long, containing 0xfffa9064. In this case the header is followed by 2 bytes of CRC (0x861f). And again, 36 bytes after the beginning of the frame, there is the Xing header, just like the ‘documentation’ said. The only problem is that the Xing header is supposed to begin after 32 bytes of 0x00 (the side information), which in this case should put it 4 + 2 + 32 = 38 bytes after the beginning of the frame. Instead there are only 30 bytes of 0x00 after the CRC. So when I correctly parse the ancillary data looking for the Xing header, I don’t find it. I then assume the file has a constant bitrate, miscalculate the length of the file, and everyone is upset that the time displayed for the file is wrong.
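One way to cope is to look in both places. Here is a hedged Python sketch, not a definitive parser: `find_xing` is a hypothetical name, the side information sizes are the usual Layer III values, and it assumes that bit 0 of the 4-byte flags word after "Xing" means a 4-byte big-endian frame count follows. The two test frames are rebuilt from the hex dumps above.

```python
# Side information length in bytes, keyed by (is_mpeg1, is_mono). Assumed values.
SIDE_INFO_BYTES = {(True, False): 32, (True, True): 17,
                   (False, False): 17, (False, True): 9}


def find_xing(frame: bytes):
    """Return (offset, frame_count) of the Xing header, or None if not found."""
    header = int.from_bytes(frame[:4], "big")
    is_mpeg1 = ((header >> 19) & 0x3) == 3
    is_mono = ((header >> 6) & 0x3) == 3
    has_crc = ((header >> 16) & 0x1) == 0  # protection bit 0: a 2-byte CRC follows
    side_info = SIDE_INFO_BYTES[(is_mpeg1, is_mono)]

    strict = 4 + (2 if has_crc else 0) + side_info  # where it ought to be
    lenient = 4 + side_info                         # where the 'documentation' puts it
    for offset in (strict, lenient):
        if frame[offset:offset + 4] == b"Xing":
            flags = int.from_bytes(frame[offset + 4:offset + 8], "big")
            frames = None
            if flags & 0x1:  # assumed: bit 0 set means the frame count is present
                frames = int.from_bytes(frame[offset + 8:offset + 12], "big")
            return offset, frames
    return None


# Second test file: no CRC, 32 zero bytes, then the Xing header.
second_file = bytes.fromhex("fffb304c") + bytes(32) + bytes.fromhex(
    "58696e67" "0000000f" "00001669" "0037b3d2")
# First test file: 2-byte CRC, only 30 zero bytes, then the Xing header.
first_file = bytes.fromhex("fffa9064" "861f") + bytes(30) + bytes.fromhex(
    "58696e67" "0000000f" "00001bbd" "003cf125")

print(find_xing(second_file))  # (36, 5737)
print(find_xing(first_file))   # (36, 7101)
```

Taking 1152 samples per frame at 44100 Hz, those frame counts work out to roughly 150 and 185 seconds. Trying the strict offset first keeps well-formed files on the correct path, and the fallback only kicks in for files like my first test file.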

This is just one of the many ways that MP3 files are often broken. Some have RIFF WAVE headers, etc. Seeking to a location in a VBR file is next to impossible. There are various patents on various MP3 decoding and, in particular, encoding technologies. I really wish everyone would instead use Ogg Vorbis.

So I find it odd that people like dink seem to strongly dislike Ogg Vorbis, and prefer MP3. I don’t understand this at all. But perhaps dink is more of an audiophile, and the compression quality of MP3 is superior to that of Ogg Vorbis. I don’t know. But I don’t see how that could possibly make up for the intense pain of dealing with MP3 files and their patents.

Russell O’Connor: contact me