Hi everyone,
I’m the Chief Engineer for a high school radio station, and we have a large MP3 library (~800GB) that needs some serious cleanup. Most of the files have incorrect, missing, or inconsistent metadata, and we also have lots of duplicate songs.
I’m looking for advice on the most efficient workflow to:
- Fix and standardize metadata using MusicBrainz Picard (or any other recommended tools)
- Identify and safely remove duplicate files while keeping only one version of each song
- Handle special cases like compilations and radio edits
- Automate the process as much as possible to avoid manually correcting thousands of files
Our goal is to end up with a clean, accurate, and professional library that’s ready for broadcasting, with minimal manual work.
Any tips, workflows, or plugin recommendations would be greatly appreciated!
Firstly, you need to work out whether this is going to be easy but time-consuming, or extremely difficult and extremely time-consuming. That depends on two things:
- Are they albums or single tracks?
- What metadata do they currently have (inc. directory and file names as well as tags)?
If you have full album releases, already grouped into directories with half-decent names and perhaps some tags as well, then Picard can probably make a decent guess, and correcting it when it gets it wrong won’t be too difficult. Use the Cluster and Lookup workflow described in the documentation.
But if they are random, unnamed tracks with no tags, then Picard will need to fingerprint each track and make a guess based on that, and you might be old and grey before you finish (assuming, that is, that as a Chief Engineer you aren’t already old and grey). Use the Scan workflow described in the documentation, or try to source the CDs and rip them anew (perhaps at a higher bitrate).
Picard doesn’t do deduplication - but if you set it up right it will put the duplicates in the same directories and you can e.g. search for files with (1) in the name or simply eyeball the directories.
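If eyeballing gets tedious, the search for "(1)" files can be scripted. This is only a rough sketch, assuming that colliding files end up next to each other with a " (1)", " (2)", ... suffix before the extension; the function name `group_suffixed_duplicates` is made up for illustration. It only reports groups and deletes nothing.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: a duplicate is marked by a trailing " (1)", " (2)", ...
# at the end of the file name, just before the extension.
SUFFIX = re.compile(r"\s*\((\d+)\)$")


def group_suffixed_duplicates(root):
    """Return {base_path: [paths]} for MP3s that differ only by a '(n)' suffix."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix.lower() != ".mp3":
            continue
        # Strip the "(n)" suffix so "Song (1).mp3" groups with "Song.mp3".
        base_stem = SUFFIX.sub("", path.stem)
        groups[path.with_name(base_stem + path.suffix)].append(path)
    # Keep only names that occur more than once.
    return {base: sorted(paths) for base, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    for base, paths in group_suffixed_duplicates(".").items():
        print(base)
        for p in paths:
            print("   ", p.name)
```

Bear in mind that a track whose real title ends in a number in parentheses would be grouped as well, so check the list before deleting anything.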
You will need to set up a decent file naming script to give structure to your library. For the purposes of deduplicating you might want to add the bitrate to the file name so that you can see at a glance which copy is the lower bitrate version and delete it.
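Once the bitrate is in the file name, picking the copies to delete can be scripted too. The sketch below assumes a made-up convention where the bitrate in kbps sits in square brackets at the end of the name, e.g. "01 Song [192].mp3"; adjust the pattern to whatever your naming script really produces. The function name `lower_bitrate_copies` is hypothetical, and it only returns a list for review rather than deleting files.

```python
import re
from collections import defaultdict
from pathlib import Path

# Hypothetical convention: bitrate in kbps in square brackets at the end
# of the file name, e.g. "01 Song [192].mp3".
BITRATE = re.compile(r"\s*\[(\d+)\]$")


def lower_bitrate_copies(root):
    """Return MP3s that have a sibling with the same name and a bitrate at least as high."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        match = BITRATE.search(path.stem)
        if not path.is_file() or path.suffix.lower() != ".mp3" or not match:
            continue
        # Same directory and same name once the bitrate tag is removed.
        key = (path.parent, BITRATE.sub("", path.stem).lower())
        groups[key].append((int(match.group(1)), path))
    candidates = []
    for versions in groups.values():
        versions.sort(reverse=True)  # highest bitrate first
        # Everything after the first entry is a deletion candidate.
        candidates.extend(path for _, path in versions[1:])
    return sorted(candidates)


if __name__ == "__main__":
    for path in lower_bitrate_copies("."):
        print("candidate for deletion:", path)
```

Two copies at the same bitrate are treated as one keeper and one candidate, so it is worth listening to a few before trusting the list.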