How do _you_ add a Release? What's your workflow?

ripping
workflow
ux
scanning
picard
musicbrainz

#1

I’m curious, how do you add a Release? What’s your workflow?

I’m about 80% of the way through adding 500 Releases: mostly CD, mostly classical or opera, many by local artists with no trace yet in MusicBrainz. My goals are: 1) to have well-tagged music files I can listen to from my devices, and 2) to have an archive (backup) of my physical media. I’ve been working on it for 10 years now.

I’ve changed tools and workflow many times. About 2 years ago I started writing down my workflow methodically, so that I didn’t have to re-learn everything each time I restarted the project. My Music Ingestion Process document is now at 50 pages. I thought it would be interesting to summarise my workflow, and hear from other editors what workflow they use.

My workflow has seven major steps. I have a box with 7 dividers, and some spindles with loose CDs, where I keep the work-in-progress. Each divider holds the CDs awaiting that step.

  1. Evaluate release. When I first get a new Release, I check MusicBrainz to see which of the following steps are already done well enough that I can skip them. Pop music from well-known Artists jumps to nearly the last stage. My latest Jocelyn Morlock CD probably has to start at step 2. I might assign a Disc ID to an existing Release entry as a by-product of this step.
  2. Populate artists in MusicBrainz. Make sure there are Artist entries for each Release Artist, Track Artist, and Composer. Sure, I can add them while adding the Release, but it’s awkward. Better to do it up front.
  3. Add Release, with variant directions for CD media and non-CD media. This includes creating the Release entry, and assigning the Disc ID.
  4. Improve existing Release entry. I have a time and patience budget for each Release. If it wasn’t consumed by adding a new Release entry, I spend it improving the Release entry that someone else already added. There is almost always something to do. Often, it’s adding Relationships.
  5. Scan cover art, media, other packaging. Upload it to the Release. I want to do this before I rip the CD, to give the Cover Art Archive time to process it for my ripping tool to use.
  6. “RIP, Tag, Move”: rip the CD or other media to an archive, extract single-track audio files from the archive, apply tags to the audio files, and move them to the file server from which I can play them. 10 years ago these steps took a long time. Now they usually take little enough time that they can be done together.
  7. “Depackage”: remove the liner notes and artwork from the case, store the CD on a spindle, and file the artwork in a storage box. Box sets and digipacks, which don’t disassemble, are handled in a different but similar way.

I plan future steps. Particularly, I want a way to analyse my collection for Releases which aren’t up to my best standards: missing cover artwork, lack of Composer tags, no proper archive of the CD itself, etc. Over time, I want to improve the entries so that they are all at a high standard. But that’s a different workflow, and a different set of tools.
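That future analysis pass could start small. Here is a sketch of a per-release checker under my own assumptions (the tag names are Picard’s usual defaults; the cover-art and rip-log flags would come from the file layout, and the checks themselves are only illustrative of the standards listed above):

```python
# Sketch of a collection audit: flag releases that fall short of a
# chosen standard. Tag names ("composer", "musicbrainz_albumid") are
# Picard's usual defaults; the specific checks are illustrative.

def audit_release(tags, has_cover_art, has_rip_log):
    """Return a list of problems found for one release.

    tags: mapping of tag name -> list of values (as mutagen presents
    Vorbis comments); the other two flags come from the file layout.
    """
    problems = []
    if "musicbrainz_albumid" not in tags:
        problems.append("no MusicBrainz release ID")
    if "composer" not in tags:
        problems.append("no Composer tag")
    if not has_cover_art:
        problems.append("missing cover artwork")
    if not has_rip_log:
        problems.append("no rip log / archive of the CD")
    return problems
```

With a library like mutagen, a FLAC file presents its Vorbis comments as exactly this kind of mapping, so the same check could run unchanged over a directory tree.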

I also value the side-effects of this work: participating in the MusicBrainz community, improving the data, documenting my processes and problems in the hope of helping others, and so on.

That’s my workflow. What is yours?


New editors: a guided, staged, self-training approach - LONG
#2

I’ve probably written up parts of this before (and no doubt in more detail), but…

Background: I use Linux, Firefox with several extensions, and git-annex. Almost all of my MB work is CDs, and almost all of it is classical (defined broadly). Rips are done to FLAC.

It’s nowhere near 50 pages, but…

Phase I

  1. Put CD in drive, have Picard look for it by disc ID, and see if it finds something. If so, take a quick look at the quality — if not, try to find it by artist (typically).
  2. Decide if I want to deal with this one now. If not, put it in the “later” pile, grab the next disc, back to step 1. (An important step — some releases take many hours to enter in; e.g., a box set where none of it is in the database).
  3. If not already in the database: Create a new release from the Picard discid search, so the discid will be auto-associated. I don’t pre-create artists, I just create them when entering the release (yay for tabs). I enter the track list using the ‘parse tracklist’ button, editing it with vim via the ItsAllText extension.
  4. If the release matched by discid wasn’t the right one (e.g., different release in same release group), find the right one. Or go to step 3 and enter it.
  5. Run my do-rip script, passing the album MBID; it runs morituri with the MBID after some sanity checks (e.g., that the rip is going to the right place).
  6. If it’s already in MB, look for any mistakes in the track list, and fix them. Except for the most trivial stuff, I again use the parser + ItsAllText + vim.
  7. Make sure to add the release to my “owned” collection
  8. At this point, the release goes in to the to-scan pile. Unless I decide to scan it immediately. But if not, I’ll move on to another release, leaving this one’s album page open in a tab.
  9. At some point morituri finishes; run the ingest-rip script on the directory. This checks the AccurateRip status and moves the rip to an appropriate directory based on whether it passed or not. It also does a git-annex add and git commit on the new files.
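The sorting decision in an ingest-rip-style script can be sketched in a few lines. This is a guess at the shape, not the real script: the per-track AccurateRip verdicts would be parsed out of the rip log, and the directory names here are illustrative:

```python
from pathlib import Path

# Sketch of ingest-rip's sorting logic: given per-track AccurateRip
# verdicts pulled from the rip log, decide where the rip directory
# goes. Directory names are illustrative, not morituri's.

def destination(track_verdicts, base=Path("rips")):
    """All tracks verified -> 'verified'; anything else -> 'need-verify'."""
    if track_verdicts and all(track_verdicts):
        return base / "verified"
    return base / "need-verify"
```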

Phase II

Typically started when the to-scan pile has grown enough.

  1. Turn on the scanner (a CanoScan 9000F Mark II). Important first step…
  2. Start gimp, start up the scanner plugin (xsane)
  3. Grab the first album off the to-scan pile, take the booklet out of it, scan the front (1200ppi, color management enabled, convert to sRGB). When that finishes, start the descreen, and while that’s running scan the next page (300ppi if text-only, 1200ppi if color pictures that’ll need descreening).
  4. Create directory in git-annex to hold resulting artwork (dir name = album MBID)
  5. Check the descreen results, try again with different parameters if too many artifacts or screen pattern too apparent.
  6. Once happy, downscale to 300ppi and export to JPEG.
  7. Repeat descreen with the rest of images that need it. For B&W text, instead do levels/curves to get text black & paper white. For B&W, convert to grayscale.
  8. For some things, I have to crop out a picture and descreen it separately, then paste it back in (e.g., a process-color image next to black text).
  9. Some scans require disassembling the jewel case. I’ve gotten pretty good at that…
  10. For some releases that need special handling (foil), make a note to photograph. Note made by adding to a (private) collection on MB.
  11. Once release fully scanned, git-annex add it and commit… And upload to CAA. Then git mv the art directory to the uploaded directory. Commit the move.
  12. Check the accuraterip status (from back in Phase I). If it passed, it goes in the “done” box. Otherwise, it goes on the shelf in the to-verify pile.

Phase IIIa

Phases IIIa and IIIb can be done in any order.

  1. Grab the discs from the to-verify pile, take them to another machine (which also has the git-annex repository checked out and sync’d).
  2. On that other machine, switch into the verify folder. Insert a disc.
  3. Run do-rip, it’ll notice it’s in the verify folder and handle it.
  4. Run verify-rerip on the resulting rip. That finds the original rip, and compares the checksums vs. the rerip. If they all match, it annotates the original rip’s log, commits that to git-annex, and moves the original rip out of the need verify folder. It also adds a copy of the verifying rip log.
  5. If verify fails, deal with it by hand. Possibilities include trying a third drive, trying EAC (via Wine), and manually comparing the audio (from the multiple rips) to see if I care.
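The heart of a verify-rerip comparison is just checksumming tracks on both sides and reporting disagreements. A minimal sketch, assuming the original rip’s checksums have already been read out of its log (the function names here are mine, not the script’s):

```python
import hashlib

# Sketch of the verify-rerip comparison: checksum each track of the
# re-rip and compare against the checksums recorded for the original
# rip, reporting per-track disagreement.

def track_checksum(data: bytes) -> str:
    """Checksum of one track's decoded audio (SHA-256 here as an example)."""
    return hashlib.sha256(data).hexdigest()

def compare_rips(original: dict, rerip: dict) -> list:
    """Return the track names whose checksums differ or are missing.

    original/rerip: mapping of track name -> checksum string.
    """
    mismatched = []
    for name, checksum in sorted(original.items()):
        if rerip.get(name) != checksum:
            mismatched.append(name)
    return mismatched
```

If the returned list is empty, the rip is confirmed and can be moved out of the need-verify folder; otherwise each named track needs the manual treatment described in step 5.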

Phase IIIb


  1. Pull up the cover art for one of the releases.
  2. Using the data in the cover art, add ARs (using the AR editor) to the release for everything in the booklet that can be represented in the MB schema. This is also when I link works (creating any needed).
  3. I personally put in the recording date on the recording/place AR and (now that the script exists) copy it to the other relations from that. If there are multiple dates that aren’t just a range, I put a range covering them all, copy that, then add multiple recording/place ARs to specify the dates exactly (but I do not duplicate all the other ARs because that’s silly).
  4. After every batch of ARs, I save and then re-enter the editor. Maybe it’s fixed now, but it used to be that if you did too many at once, it’d time out and your work was lost.
  5. Anything important that I can’t fit in the MB schema goes as an annotation.
  6. Double-check everything looks good. If it does, use the set recording artists script to fix the recording artists.
  7. Put in an edit to change the data quality to high.
  8. Check open edits. If there are any I care about for tagging, move the tab to the “wait edits” tab group. (Tab Groups extension for Firefox)
  9. If the release is verified (either by re-rip or originally by AccurateRip), then go to Phase IV. Otherwise, move the tab to the “wait verify” tab group.

Phase IV

Phase IV is where the files actually get tagged. Tagging is a really expensive operation, since git-annex does not delta-encode the different revisions. Each revision is stored in full (I plan to fix this someday).

  1. Make sure git-annex sync --content has been done before tagging, so that the untagged originals are copied where they need to be. Otherwise it’s an annoyance to fix.
  2. git-annex edit the files.
  3. Run flac-replaygain-group on the release. This is a script which runs replaygain with a special definition of “album”: one complete work (e.g., a symphony), so there are often multiple “albums” on a release. It isn’t a very smart script; it works by asking me where the breaks are (by having me edit a file listing in gvim, adding - to mark the breaks). It does at least run the multiple replaygain calcs in parallel.
  4. Run Picard on the files. Make sure it’s got the right release. Make sure that release is in the owned collection. Picard is set to download all album art, original size, and embed it. Confirm replaygain tags in Picard are right.
  5. Hit save in Picard on the album.
  6. Drag the saved album back over to unmatched
  7. Hit scan.
  8. Check the files AcoustID put in place to make sure it didn’t do anything stupid (e.g., tracks in wrong order). If it did, fix the AcoustID data on the recordings.
  9. Drag the files AcoustID didn’t match over to the album. Make sure Picard didn’t do anything stupid (e.g., tracks in wrong order). Submit the AcoustIDs.
  10. Save again, so AcoustID tags are in the files.
  11. Run rip-picarded script, which checks a bunch of things (MB tags present, AcoustID present, rip verified, rip log exists). Basically this confirms all the steps have been done. Also, it checks for a track 0, and if found checks if it is silent, and if so removes it. Then it git-annex adds the files, and moves them to the “picarded” directory.
  12. Look at the release tab. Confirm that it’s in the “owned” collection.
  13. Finally, close the browser tab.
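The break-marker editing that flac-replaygain-group asks for amounts to a tiny parser. This sketch is my paraphrase of the description in step 3, not the real script: one file per line comes back from gvim, with a bare `-` line wherever one “album” (complete work) ends and the next begins:

```python
# Sketch of the break-marker parsing in a flac-replaygain-group-style
# script: the edited listing contains one file per line, with a bare
# "-" line marking the boundary between "albums" (complete works).
# Each resulting group gets its own album-gain calculation.

def split_albums(lines):
    """Split a file listing into groups at '-' marker lines."""
    groups, current = [], []
    for line in lines:
        line = line.strip()
        if line == "-":
            if current:
                groups.append(current)
                current = []
        elif line:
            current.append(line)
    if current:
        groups.append(current)
    return groups
```

Each group can then be handed to a separate replaygain run, which is what makes the parallelism mentioned above easy.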

Phase V

When done Picarding releases.

  1. git-annex sync --content again
  2. Run mpd-update which has MPD update its collection data.

#3

Great workflows, thank you for sharing them.

What I’m missing:
How exactly do you rename your releases and tracks? How do you organize them (do you use software like MediaMonkey, MusicBee, etc. to quickly find and play your songs)?


#4

I was trying to give the 1-screen summary, not the full 50 pages :grinning:.

I use Picard to tag and rename audio files, though I expect to switch to beets. The audio files end up in a big directory tree on a file server. An old Linux laptop running Banshee plays them from there, through the stereo amplifier. And I convert them all to MP3, and copy tailored subsets of the MP3 versions down to phones, iPads, etc for listening on the go.
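The FLAC-to-MP3 conversion step could look like this sketch, which just builds an ffmpeg command line per file (ffmpeg with libmp3lame is an assumption on my part, as is the VBR quality setting; the post doesn’t say which converter is used):

```python
from pathlib import Path

# Sketch: build an ffmpeg command to transcode one FLAC to MP3 for the
# portable-device copies. "-q:a 2" selects high-quality LAME VBR
# (roughly 190 kbps); ffmpeg copies the tags over by default.

def mp3_command(flac_path, mp3_root):
    src = Path(flac_path)
    dst = Path(mp3_root) / src.with_suffix(".mp3").name
    return ["ffmpeg", "-i", str(src), "-codec:a", "libmp3lame",
            "-q:a", "2", str(dst)]
```

Running the resulting argv lists with subprocess over the FLAC tree would produce the parallel MP3 tree that the tailored subsets are copied from.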


#5

I mostly leave the names that morituri gave them (which is based on the MB data, though often before I edited it — so sometimes wrong). I think these are basically morituri defaults…

Format is release artist - disc title (disc number/total if multi-disc). So winds up looking like:

$ ls /srv/music-annex/rip/picarded/ | head
00_dl_processed
Aaron Copland; Detroit Symphony Orchestra, Leonard Slatkin - Appalachian Spring (Complete Ballet) - Hear Ye! Hear Ye!
Aaron Copland; London Symphony Orchestra, Columbia Symphony Orchestra, Aaron Copland, William Warfield - Copland Conducts Copland - Fanfare for the Common Man - Appalachian Spring - Rodeo - Old American Songs
Alan Hovhaness; Greg Banaszak, Eastern Music Festival Orchestra, Gerard Schwarz - Symphony no. 48 _Vision of Andromeda_ - Prelude and Quadruple Fugue - Soprano Saxophone Concerto
Alan Hovhaness; Javier Calderón, Royal Scottish National Orchestra, Stewart Robertson - Guitar Concerto no. 2 - Symphony no. 63 _Loon Lake_ - Fanfare for the New Atlantis
Alan Hovhaness; John Wallace, Royal Scottish Academy of Music and Drama Wind Orchestra, Keith Brion - Symphonies nos. 4, 20 and 53 - The Prayer of Saint Gregory
Alan Hovhaness - Mysterious Mountain etc.
Alan Hovhaness; Seattle Symphony, Gerard Schwarz - Mysterious Mountain - And God Created Great Whales
Alan Hovhaness; Seattle Symphony, Gerard Schwarz - Symphony no. 1 _Exile_ - Symphony no. 50 _Mount Saint Helens_
Alan Hovhaness;Seattle Symphony - Symphony No. 22 _City of Light_ - Cello Concerto

Sometimes that results in a disc title longer than 255 bytes, which is the per-filename limit on ext4. So those get shortened by hand (“Mysterious Mountain etc.” above).
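The by-hand shortening could be automated; a sketch (ext4 limits each path component to 255 bytes, so the truncation has to count UTF-8 bytes, not characters — the “ etc.” suffix follows the example above):

```python
# Sketch: shorten a directory name to fit ext4's 255-byte limit on a
# single path component, truncating at a UTF-8-safe boundary and
# marking the cut the way the post does ("... etc.").

MAX_BYTES = 255

def shorten(name, suffix=" etc.", limit=MAX_BYTES):
    if len(name.encode("utf-8")) <= limit:
        return name
    budget = limit - len(suffix.encode("utf-8"))
    truncated = name.encode("utf-8")[:budget]
    # errors="ignore" drops any trailing partial multi-byte sequence.
    return truncated.decode("utf-8", errors="ignore").rstrip() + suffix
```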

File names are tracknum. artist - title, e.g.,

$ ls
'01. Aaron Copland - Hear Ye! Hear Ye! - Scene I.flac'     '12. Aaron Copland - Hear Ye! Hear Ye! - Scene XII.flac'
'02. Aaron Copland - Hear Ye! Hear Ye! - Scene II.flac'    '13. Aaron Copland - Hear Ye! Hear Ye! - Scene XIII.flac'
'03. Aaron Copland - Hear Ye! Hear Ye! - Scene III.flac'   '14. Aaron Copland - Hear Ye! Hear Ye! - Scene XIV.flac'
'04. Aaron Copland - Hear Ye! Hear Ye! - Scene IV.flac'    '15. Aaron Copland - Hear Ye! Hear Ye! - Scene XV.flac'
'05. Aaron Copland - Hear Ye! Hear Ye! - Scene V.flac'     '16. Aaron Copland - Hear Ye! Hear Ye! - Scene XVI.flac'
'06. Aaron Copland - Hear Ye! Hear Ye! - Scene VI.flac'    '17. Aaron Copland - Hear Ye! Hear Ye! - Scene XVII.flac'
'07. Aaron Copland - Hear Ye! Hear Ye! - Scene VII.flac'   '18. Aaron Copland - Hear Ye! Hear Ye! - Scene XVIII.flac'
'08. Aaron Copland - Hear Ye! Hear Ye! - Scene VIII.flac'  '19. Aaron Copland - Appalachian Spring (Complete Ballet).flac'
'09. Aaron Copland - Hear Ye! Hear Ye! - Scene IX.flac'    'Aaron Copland; Detroit Symphony Orchestra, Leonard Slatkin - Appalachian Spring (Complete Ballet) - Hear Ye! Hear Ye!.cue'
'10. Aaron Copland - Hear Ye! Hear Ye! - Scene X.flac'     'Aaron Copland; Detroit Symphony Orchestra, Leonard Slatkin - Appalachian Spring (Complete Ballet) - Hear Ye! Hear Ye!.log'
'11. Aaron Copland - Hear Ye! Hear Ye! - Scene XI.flac'    'Aaron Copland; Detroit Symphony Orchestra, Leonard Slatkin - Appalachian Spring (Complete Ballet) - Hear Ye! Hear Ye!.

I currently use Clementine and MPD (running on Raspberry Pi, outputting digital over HDMI) to browse/play the collection; neither of these care about how the files are organized. I use ncmpcpp (on the Pi) and M.A.L.P to control MPD. None of these are perfect, and someday I’ll probably give in and write my own collection browser…

I also have BubbleUPnP running, which I can use to stream music to e.g., phones. Or to other UPnP devices (and actually have an OpenHome renderer on the Pi as well, talking to MPD).


#6

I’m adding my collection of ~800 LPs, ~500 CDs, and a couple hundred digital downloads.

I usually work in batches of 5 or 6.

  1. First, I check MB to see if my releases exist. If they do, I add them to a collection called “Working”. If not, I look for a discogs page or something and add it to a bookmark folder called “Working”.
  2. I get images of the packaging by scanning the CD’s or photographing the LP’s. I get all the packaging I have available, not just the covers.
  3. I go back to MB and add those releases that aren’t in the database, or add cover art to those that don’t have it. At this point, I’m only adding the front covers. For the releases that exist, check for typos and such.
  4. Rip the music to .mp3. I use Foobar for the CD’s and Audacity for the LP’s.
  5. Use Picard to tag the ripped music into a folder called “First Listen”.
  6. Listen to the music to make sure the rip succeeded.
  7. Go back through each release and fully process them one by one.
    a. Process and add all the cover art. I use Paint Shop Pro for post-processing.
    b. Add anything that needs to be added in the release editor: Works, ARs, URLs, etc. If there are existing ARs that contradict what’s in the liner notes, I research and correct as necessary.
  8. Move the folders into my main collections and add to Music Bee.
  9. Go choose the next batch and repeat. I usually alternate between CD’s and Vinyl.

I rename them as follows:
…Tagged Music\~D~\Dylan, Bob\Planet Waves\01 - Bob Dylan - On a Night Like This.mp3

Under the root, I also have separate folders for Various Artists and Soundtracks.
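That layout can be sketched as a path builder. The `~D~` sort-letter bucket and “Last, First” artist folder follow the example path above; the sort-name handling here is simplified and assumes a “Last, First” sort name is already available (as in Picard’s artistsort tag):

```python
from pathlib import PureWindowsPath

# Sketch of the layout above:
#   Tagged Music\~D~\Dylan, Bob\Planet Waves\01 - Bob Dylan - On a Night Like This.mp3
# PureWindowsPath matches the backslash-separated example; sort names
# are assumed to already be in "Last, First" form.

def track_path(root, artist_sort, artist, album, tracknum, title):
    bucket = "~%s~" % artist_sort[0].upper()
    filename = "%02d - %s - %s.mp3" % (tracknum, artist, title)
    return PureWindowsPath(root) / bucket / artist_sort / album / filename
```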

On my computer, I use Music Bee. I use Plex to play music on my entertainment center. On the go, I have Music Bee dump a random selection of songs from my main playlist onto my phone. That usually lasts about a month. I use GoneMad on my phone.


#7

Basically, I first check with isrcsubmit.py to make sure the DiscID and ISRCs are in MB (if relevant). If the release doesn’t exist at all, I usually do a cursory search in other systems (libraries, Discogs, etc.) to see if it exists elsewhere for easy importing, add the basic release from there, and rerun isrcsubmit.py. After that I judge whether I want to add all relationships for the release. Once I’ve either added those (or decided to skip them), I run my rip-disc script, which takes care of ripping using whipper (the successor of morituri), calculating replaygain and recompressing using flac, initial tagging with Picard, analysing and submitting to #acousticbrainz, and compressing and stowing away with 7z and rsync.
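A pipeline of that shape can be expressed as the sequence of commands it would issue. This is a hedged sketch only: the tool names are real (whipper, flac, metaflac, 7z, rsync), but the flags are common defaults of mine, and the Picard and AcousticBrainz stages are left out because they aren’t simple one-shot commands:

```python
# Sketch: the stages of a rip-disc-style pipeline as argv lists.
# Flags are illustrative; the real script's options may differ.

def pipeline_commands(rip_dir, archive, remote):
    return [
        ["whipper", "cd", "rip"],                         # rip the disc
        ["flac", "--best", "--force", rip_dir],           # recompress
        ["metaflac", "--add-replay-gain", rip_dir],       # replaygain tags
        ["7z", "a", archive, rip_dir],                    # compress
        ["rsync", "-a", archive, remote],                 # stow away
    ]
```

Driving each argv list through subprocess.run, stopping on the first failure, gives the single-command ergonomics described above.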

If I want to actually listen to the music at some point, I usually extract the .7z and import into my beets library. I usually play using Kodi but I want to poke at mopidy again soon. (I play using Vanilla Music on my phone.)

I only very occasionally scan, and when I do I only do a basic (high res) scan and upload that to CAA, leaving for others (or myself at some later day) to prettify the scans.

(Shameless self‐promotion: I sometimes live stream my adding and editing of data, so you can see my workflow in action in my stream archives:


(May or may not eventually make it to YouTube as well.))