Dump Released on Internet Archive

For any digital archaeologists out there, this might be interesting.

The Internet Archive, namely Jason Scott, has been given a “dump” of the site’s digital audio files from before its CNET takeover in late 2004. This “dump” contains a large number of early MP3 files from various independent musicians.

The dump has been “de-duplicated” and roughly sorted by filename into various chunks.

I am of course downloading a chunk to see what results turn up in Picard! :smiley:


Nice finding, thanks for letting us know.

Did you read the note:

… As a result, descriptions and information are scant about the tracks and artist information was not wrapped into the MP3s themselves. While they are all playable, very little else is known about them. They have been put into a set of relatively large multi-track sets, but none of the tracks are necessarily related to each other.

Please also let us know how well AcoustID and/or Picard recognize these tracks.

Well so far in the Internet Archive Discord, Jason’s done a few packs and we’re getting less than maybe 1% detection rate… these tracks are SUPER RARE - so looks like I’ve got some more research and adding to the database ahead of me :cold_sweat:


I have downloaded the part “A-2” (18 GB) with 4’996 songs.
These MP3s are all 128 kbit/s CBR and contain only an ID3v1 tag.
The tag holds the artist name and title, plus a URL to the old website in the comment field.

There is no information about Album, release Year, Genre and no Cover is embedded.
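
For anyone who wants to check their own chunk, here is a minimal sketch of reading those ID3v1 fields with plain Python. It assumes the standard ID3v1 layout (the last 128 bytes of the file, starting with `TAG`); the function names `parse_id3v1` and `read_id3v1` are only illustrative and do not come from Picard or any other tool mentioned here.

```python
from typing import Optional


def parse_id3v1(tail: bytes) -> Optional[dict]:
    """Parse the final 128 bytes of an MP3 file as an ID3v1 tag.

    Returns None when the bytes do not look like an ID3v1 tag.
    """
    if len(tail) != 128 or tail[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tail[3:33]),
        "artist": text(tail[33:63]),
        "album": text(tail[63:93]),
        "year": text(tail[93:97]),
        "comment": text(tail[97:127]),  # where the old website URL would sit
        "genre": tail[127],             # numeric genre code, 255 = unset
    }


def read_id3v1(path: str) -> Optional[dict]:
    """Read the ID3v1 tag of the file at `path`, if it has one."""
    with open(path, "rb") as f:
        f.seek(0, 2)           # jump to the end to learn the file size
        if f.tell() < 128:
            return None
        f.seek(-128, 2)        # the tag is the last 128 bytes
        return parse_id3v1(f.read(128))
```

Calling `read_id3v1("some-track.mp3")` (a hypothetical filename) returns a dictionary of the fields, or `None` if there is no tag. Keep in mind that the ID3v1 comment field is only 30 bytes long, so a longer URL may be cut off.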

The only thing I currently see, is to search for additional information in the wayback machine:

Unfortunately, the embedded URLs I tried are no longer available in the Wayback Machine…

Any other ideas?

I’m very tempted to download a pack and see what those mp3s are :grin:

Go for it! I’m taking a detour at the moment and filling in the database with bands / artists that aren’t already here, in the hope that we can match some of these thousands of digital tracks to someone.

Ultimately no; it’s a case of going extremely hard with searches, using other databases like Discogs, AllMusic etc. to see if they have any records. It’s all fun, but extremely time-consuming.


Jason has just located this:

This contains a large amount of metadata that is likely missing from the Wayback Machine, but keep in mind it’s 10 GB of HTML files.

What is their Discord server link?

So I’m currently working to see if I can get a version of this dump functional so that people can access it over the internet… watch this space…

An update for those concerned.

I have been a busy beaver, reconstructing the website dump from that time and have released it as a public project

This will allow you to browse through the web-site artist pages and link them back for evidence when submitting to MBz!


Hi - there are still a lot of us around who worked on the code, so if there are particular questions you need answered, ask away.

As far as I remember we never stored anything other than 128kbit tracks; storage was expensive so it wasn’t like there were archives of better bit rate masters somewhere.

What you really want are the YML files that we used to build all the artist pages (or the MySQL backups, but that’s even more unlikely). Unfortunately, I don’t think we ever served those from web servers; they were used to statically build pages, and we didn’t normally have JavaScript that fetched data.


Hi James

Very cool - I’m still working on this here and there when I get a few moments to spare.

What IA have are the final resultant/rendered HTML pages, and not all of them.

Anything overly dynamic (like genre pages) is gone, but artist pages, artist-info and song-info pages seem to be around.

We also seem to have a lot of the cover art (albeit in classic early internet low resolution) for the tracks.

My next goal is to gather a list of the artist images that are missing, and “touch” the Internet Archive’s WayBackMachine to see if they saved any of them. Get them downloaded and uploaded.
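
As a rough idea of what that “touch” could look like, here is a sketch that asks the Wayback Machine availability endpoint (`https://archive.org/wayback/available`) whether a snapshot of a given URL exists. The helper names and the default timestamp of `20030101` are my own assumptions for illustration; this is not the script actually used for the project.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url: str, timestamp: str = "20030101") -> str:
    """Build the query URL asking for the snapshot closest to `timestamp`."""
    params = {"url": url, "timestamp": timestamp}
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(payload: dict) -> Optional[str]:
    """Pull the snapshot URL out of a decoded availability response."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None  # nothing archived for this URL


def find_snapshot(url: str, timestamp: str = "20030101") -> Optional[str]:
    """Ask the Wayback Machine for the closest snapshot of `url` (needs network)."""
    with urlopen(availability_query(url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Looping `find_snapshot` over the list of missing image URLs would give a list of snapshot links to download, with `None` for anything that was never saved. It would be polite to pause between requests when checking thousands of URLs.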

After that, the next thing would be to see if we can link the tracks in the MP3 Rescue Barge back to the website that I’ve got… this should allow people to find the MP3s a bit more “naturally”.

Ultimately, my goal is to have this up and running in such a way that I can then start to add the songs in the rescue barge into the MBz DB.

You mean that the audio tracks on DAM CD were burnt from the same 128kbit MP3 files?

DAM CDs were CD-Rs with audio tracks, followed by a data session containing the same tracks as MP3 files.

Yes, the DAM CDs (in theory, stood for “Digital Audio Music”…) were the same 128kbit tracks as far as I remember. I mostly worked on my.mp3 though, not the DAM stuff.

In the very early days when the office was on the General Atomics campus, there was a bell that rang every time a DAM CD was sold.


Just reminds me of this scene :laughing:

Haha, my kind of workplace!!


(the lyrics, if anyone’s interested)

if you haven’t seen it (the show) - i highly recommend

First of all, a giant thanks to everyone chipping in, especially the 2003 project. Coincidentally, I found this thread while trying to piece together more information on the rescue barge, and thanks to the 2003 project I was able to find out more about some of the MP3s I’ve been pulling aside and tagging. Exciting! My question is (and maybe no one knows): “F”, “O” and “R” seem to be missing from the dump. Does anyone happen to know if more is coming, or did Jason contribute all that he had? Again, thank you everyone, I’m really glad I found this thread!
