RaveBrainz: a MusicBrainz-compliant version of the RaveLibraryOnline

DerekFerric · March 26, 2022, 1:47am

After a lot of consideration, I’m totally rebuilding a group project by volunteers to archive the world’s live, studio and radio shows of all genres of music stemming from disco / house music circa 1980 to the present day. I’m going to add the sources for this data to this thread as an up-to-date index to refer to and link from and perhaps other MusicBrainz editors with skills inputting can assist in uploading all these tags to the relevant sections of the database form automatically. The Release Group can be set as Rave Library Online to distinguish this collection from other unofficial releases on Discogs etc. Thanks in advance to anybody who can volunteer to assist with this project. The main ambition is to help the world improve their own personal libraries with additional details as and when they are added to MusicBrainz and the user has scanned their own collection using Picard and the AcoustID technology.

Everything currently archived online following this standard can be found at the master folder:

The mirror folder also contains earlier versions of the same items, uploaded when fewer details had been added or before MP3Gain data had been added, so when there are multiple versions of the same show just look for the most recently uploaded one for the most completely tagged version. Some details may still be missing so this is a group effort.
All the individual source files in the online mirror archive https://archive.org/download/mediatimeline so far should be able to be streamed instantly by each of the following links or downloaded to your device by long-pressing (on touch screens) or right clicking (on mouse-cursor screens) the links and following the instructions to save.
All files are from the public domain but if promoters or creators want to include higher resolution versions in this timeline, give us a shout. This is intended primarily as an educational resource for people interested in the evolution of dance music from disco right up to the fusion genres of today. It focuses primarily on the rise of the turntablist as a performer in his or her own right, but also includes live performing artist shows if and when they can be included.

Edit:
RaveBrainz is just a working title to make it easier to simply look up via word-of-mouth.

DerekFerric · March 26, 2022, 1:47am

I’ve updated the original post so this message can be deleted.

DerekFerric · March 27, 2022, 10:08pm

I’ve updated the original post so this message can be deleted.

aerozol · March 27, 2022, 10:36pm

Do you have an example of how one of these should look in MB, all filled in?
Would be very useful to have a guide

DerekFerric · May 20, 2022, 1:19am

The %ALBUMARTIST% field will always be a straight copy of whatever is in the %ARTIST% field rather than “Various Artists” because certain default media players will not display who’s actually performing if you have an entire festival’s stages recorded.
All the other details go in the %ALBUM% field, made up from some temporary tags I set up for this task.
So the information in the ALBUM field is placed there by using TAG TO TAG in MP3Tag with the following string (if they are contained in the files, which they will be if you got them from here):
%DATE%: %PROMOTER%, “%EVENT%” %ARENA%: %VENUE%, %TOWN_CITY%, %STATE%, %COUNTRY%.
The details in %PROMOTER% should also be on MusicBrainz under %PUBLISHER%.
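
For anyone who would rather script this step than use TAG TO TAG, here is a minimal Python sketch of the same string assembly. It assumes the temporary tags have already been read into a plain dict; the function name `build_album` and the sample values are made up for illustration and are not part of MP3Tag or MusicBrainz.

```python
# Minimal sketch: rebuild the ALBUM string from the temporary tags.
# The field names mirror the MP3Tag placeholders in the post above;
# build_album and the sample values are hypothetical.

FIELDS = (
    "DATE", "PROMOTER", "EVENT", "ARENA",
    "VENUE", "TOWN_CITY", "STATE", "COUNTRY",
)

ALBUM_TEMPLATE = (
    "{DATE}: {PROMOTER}, “{EVENT}” {ARENA}: "
    "{VENUE}, {TOWN_CITY}, {STATE}, {COUNTRY}."
)


def build_album(tags):
    """Fill the ALBUM template from a dict of tag values."""
    missing = [name for name in FIELDS if not tags.get(name)]
    if missing:
        # Refuse to build a half-empty string; report what is absent.
        raise ValueError("missing tags: " + ", ".join(missing))
    return ALBUM_TEMPLATE.format(**{name: tags[name] for name in FIELDS})


# Placeholder values only, not a real event.
sample = {
    "DATE": "1992-08-01",
    "PROMOTER": "Example Promotions",
    "EVENT": "Example Event",
    "ARENA": "Arena 1",
    "VENUE": "Example Venue",
    "TOWN_CITY": "Exampletown",
    "STATE": "Examplestate",
    "COUNTRY": "Exampleland",
}
print(build_album(sample))
```

Raising an error on a missing tag is only one possible choice; a real script might prefer to skip the empty parts instead.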

DerekFerric · November 11, 2022, 10:58pm

Since starting this thread, the master archive has grown somewhat… Over a quarter of a terabyte and counting, with an easier address at linktr.ee/ravelibraryonline which should now be the permanent bookmark until we acquire a domain which can stay up forever…

I’ve also embedded new tags in the files with information such as venue, town/city, state, country and even less important details like which set, in the case of DJs who play twice at one rave, which might come in useful once some script or plug-in whizzkid can help us import all this metadata in bulk…
Feel free to download a bunch of the most lengthy named files and open in MediaInfo or whatever to inspect what I’ve taken a rather unhealthy amount of time doing in batches!
As I probably said earlier, in the absence of clear information about sources of many of these rips (broadcast / livestream / CD / cassette / USB etc etc), the fastest way for us to get the relevant information into the tag fields is to treat them in the same way you guys treat podcast series recorded live, so you should hopefully see the similarities… Of course when all this is inputted to the main database it should all be treated as Unofficial releases, at least until it can be tied to any pre-existing releases already up there.

ARTIST and ALBUM ARTIST:
<DJ(s) or PA(s)>[ feat. <MC(s) or Vocals>]

ALBUM and TITLE:
<Date (which should default to Released Date on Discogs)>: <Event Organiser or Network Station (which should default to Record Label on Discogs)>, “<Event or Show (which should default to Album on Discogs)>”[: Venue][, City][, State][, Country]
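
Going the other way, a bulk import would need to split that ALBUM/TITLE string back into its parts. Below is a rough Python sketch using a regular expression. The pattern and the name `parse_album` are my own assumptions based on the layout above, and it assumes the curly quotes around the event name are kept. Because City, State and Country are all optional, the sketch returns them as an ordered list rather than guessing which is which.

```python
import re

# Hypothetical pattern for:
# <Date>: <Organiser>, “<Event>”[: Venue][, City][, State][, Country]
ALBUM_PATTERN = re.compile(
    r"^(?P<date>[^:]+): "          # date, up to the first colon
    r"(?P<organiser>.+?), "        # event organiser or network station
    r"“(?P<event>[^”]*)”"          # event or show, inside curly quotes
    r"(?:: (?P<venue>[^,]+))?"     # optional venue
    r"(?P<places>(?:, [^,]+)*)$"   # optional city, state, country
)


def parse_album(album):
    """Split an ALBUM/TITLE string into its parts, or raise ValueError."""
    match = ALBUM_PATTERN.match(album)
    if match is None:
        raise ValueError("does not follow the expected layout: " + album)
    places = [part for part in match.group("places").split(", ") if part]
    return {
        "date": match.group("date"),            # Released Date on Discogs
        "organiser": match.group("organiser"),  # Record Label on Discogs
        "event": match.group("event"),          # Album on Discogs
        "venue": match.group("venue"),          # None when absent
        "places": places,                       # city/state/country, in order
    }


# Placeholder values only, not a real event.
print(parse_album(
    "1992-08-01: Example Promotions, “Example Event”: Example Venue, Exampletown, Exampleland"
))
```

A venue name that itself contains a comma would be split incorrectly here, so reading the separate embedded tags is safer whenever they are still present in the file.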

FILENAME also includes the bitrate in kilobits per second and the file size in bytes (to make it easier to compare files at a glance as similar or absolutely identical) for archiving purposes but this info won’t need importing to the MusicBrainz database so much.
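
The "absolutely identical" check can also be done by script rather than by eye. This is a small sketch using only the Python standard library: it compares file sizes first and then a SHA-256 hash of the contents. The function names are hypothetical and this is not how the archive filenames themselves are generated.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def are_identical(path_a, path_b):
    """True only if both files have the same size and the same contents."""
    # Different sizes can never be identical, so skip the slower hashing.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    return file_digest(path_a) == file_digest(path_b)
```

Two files of the same size with different hashes would be the "similar" case: likely the same recording, but tagged or encoded differently.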

I use MP3Tag to construct filenames based on the separate tags I embed for those variables (which can often be lost once the files are imported into media managers, please take note).

I’m looking to learn how to write code in Python but I don’t have as many braincells as I would like after listening to 30 years of breakbeat hardcore jungle techno so this could take a while. Your help is greatly appreciated MetaBrainz! Thanks and wishing everybody the best of health in these dark times.

DerekFerric · November 11, 2022, 11:05pm

Does the information above make sense to anybody apart from my brain? LOL! Hope you’re well.

aerozol · November 12, 2022, 8:31am

Hah, I’m good thanks, hope you are well! Looks like you’re having fun with your archive

I’m not completely sure from this what your plan is, is it to get it all in MB? Or are you sharing in case other people with similar interests want to or can put it in?

If you 100% fill out a MusicBrainz entry/release and then link it here I would still be interested in having a look. Though that’s not to say I have time to pitch in, life gets busy…

DerekFerric · May 19, 2023, 7:24am

The plan is indeed to get all the files into MusicBrainz, or at least the information from the ID3 tags, creating a comprehensive collection for anyone with similar interests to enjoy. I’m aiming to organize the data and share it with the community so that we can collectively enhance our music libraries. Once I’ve completed a MusicBrainz entry with all the details filled out, I’ll definitely share it with you. I understand that life can get busy, so no pressure to contribute, but your input and insights would always be appreciated. GPT has come up with some suggestions. Here’s what he/she said:

Yes, you can indeed create a script to extract data from your locally stored files and automatically input them into the MusicBrainz database. Here are some suggestions to help you with this task:

File Parsing: Write a script in a programming language like Python to parse the metadata from your audio files. You can use libraries such as Mutagen or eyeD3 to extract information like artist, album, track title, and other relevant details.
MusicBrainz API: Utilize the MusicBrainz API to interact with the MusicBrainz database programmatically. The API allows you to create new releases, add track information, and update existing entries. You’ll need to authenticate with MusicBrainz to access certain features of the API.
Moderator note: the API does not actually allow creating new releases, see further posts.
Data Mapping: Map the extracted metadata from your files to the corresponding fields in the MusicBrainz database. Ensure that your script correctly matches and populates the relevant MusicBrainz fields based on the information extracted from your files.
Batch Processing: Implement a batch processing mechanism in your script to handle multiple files efficiently. This will enable you to automate the process of uploading data from a large number of files in a streamlined manner.
Error Handling: Consider implementing error handling mechanisms in your script to handle cases where data extraction or API calls encounter issues. This could involve logging errors, retrying failed operations, or skipping problematic files while continuing the process.
Testing and Validation: Before executing your script on a large scale, perform thorough testing on a smaller subset of files to ensure the accuracy and integrity of the data being uploaded to MusicBrainz. Validate the results and make necessary adjustments to your script as needed.

Please note that working with APIs and scripting can be complex, so it’s advisable to familiarize yourself with the MusicBrainz API documentation and programming concepts beforehand. Additionally, be mindful of MusicBrainz guidelines and best practices to ensure that the data you contribute aligns with their standards.
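
To make the file parsing, batch processing and error handling suggestions a bit more concrete, here is a rough Python sketch. It only reads tags into a dict for review; given the moderator note above, it does not try to create anything in MusicBrainz. Mutagen is a third-party library (`pip install mutagen`), and the function names, the logger name and the `*.mp3` pattern are assumptions for illustration.

```python
import logging
from pathlib import Path

log = logging.getLogger("ravebrainz")  # hypothetical logger name


def read_tags_with_mutagen(path):
    """Read tags from one file using Mutagen's easy interface."""
    import mutagen  # imported lazily so the rest runs without Mutagen

    audio = mutagen.File(path, easy=True)
    if audio is None:
        raise ValueError(f"unsupported or unreadable audio file: {path}")
    # Easy-mode values are lists of strings; keep the first of each.
    return {key: values[0] for key, values in audio.items() if values}


def collect_tags(folder, read_tags=read_tags_with_mutagen, pattern="*.mp3"):
    """Return (tags_by_file, failed_files) for matching files under folder."""
    root = Path(folder)
    tags_by_file, failed = {}, []
    for path in sorted(root.rglob(pattern)):
        name = str(path.relative_to(root))
        try:
            tags_by_file[name] = read_tags(path)
        except Exception as error:
            # Log and skip the problem file so the batch keeps going.
            log.warning("skipping %s: %s", name, error)
            failed.append(name)
    return tags_by_file, failed
```

Passing the reader in as a parameter means the batch loop can be tried on a small test folder with a stand-in reader before pointing it at the real archive, which fits the testing and validation suggestion.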

DerekFerric · May 24, 2023, 1:17pm

Not sure if you got to see my reply, @aerozol, but I should clarify part of the reason I posted that information about what I need to do to accomplish the task I’m taking on with our external Rave Library project, even though many of you tech-heads may have already known it: I’m sharing the link to this thread in other locations online for other members of my community to benefit from. The details now hidden were news to me, as somebody not advanced at scripting, and I hoped they might inspire my readers elsewhere to volunteer to help me create some sort of plug-in, both to gift back to you guys here as a contribution to the project and to help me be more effective with my time. If it needs to remain hidden, so be it, but it wasn’t off-topic in my opinion. If it was a post about cryptocurrency, I would understand the need to nuke it. I’m trying to keep everything useful in this one thread.

outsidecontext · May 24, 2023, 1:28pm

I think part of the problem with the post is that the information given by ChatGPT is wrong. Specifically “The [MusicBrainz] API allows you to create new releases, add track information, and update existing entries.” is not true, as the API does not provide the claimed functionality.

ChatGPT always presents its results with much confidence. But publishing this unfiltered can spread wrong information easily.

On a positive note, I liked how ChatGPT included some words of caution to respect MB guidelines and best practices and to validate any data before submitting it.

reosarevok · May 24, 2023, 1:47pm

I unblocked this post since this seems like a legitimate, announced and non-confrontational use of ChatGPT. That said, as @outsidecontext said, the LLM is confidently claiming false information, which is problematic since it might confuse other users who see your post. Now that the thread makes the situation clear, I’m ok with leaving the post visible - I’ll add a small note to it though.

Please let’s not have further debate about forums and LLMs here though, and let’s stick to the RaveBrainz topic. Further discussion, if desired, can move to its own post.

aerozol · May 25, 2023, 12:21am

I think the MB community appreciates the sentiment and that you’re here to help out @DerekFerric, and I hope you didn’t feel too put down when other users and I said we don’t want too much AI content on the forums (unless applied with care).

In this case, sharing stuff from ChatGPT that may or may not be true (without taking time to check) is worse than not sharing anything at all, IMO. Even with the disclaimer that it was written by chatGPT, search tools can index the incorrect information and feed it back to other users/the web/new AI being trained… so it’s good that @reosarevok took the time to add a correction.

This is why we’re so lucky to have our wonderful tech-heads here, who take the time to write helpful posts and answer questions like yours, even when they could save time by telling people “just chatGPT it”