Hi everyone,
I wanted to share a project I’ve been working on called Retreivr. The core idea is pretty simple: use MusicBrainz as the canonical source of truth to build a clean, fully tagged local music library.
Most tools in this space either pull metadata from multiple sources or infer tags after the fact from whatever files they downloaded. I wanted something that flips that around — metadata first, acquisition second.
The workflow is roughly:
- Search for a track, album, or artist using MusicBrainz data.
- Select the canonical item (track, release, or release-group).
- The system resolves the full MusicBrainz entity graph.
- It then acquires the audio from public sources.
- Final files are written with rich embedded tags including MBIDs, release info, track/disc numbers, and ISRC when available.
The goal is deterministic results and libraries that stay consistent over time, instead of accumulating messy or conflicting tags.
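To make the resolve and tagging steps concrete, here is a rough sketch. It is not Retreivr's actual code. It assumes the MusicBrainz JSON web service (`/ws/2/release/<MBID>` with `inc` and `fmt=json`) and Picard-style Vorbis comment names. The function names `release_lookup_url` and `tags_for_release` are made up for illustration, and the acquisition and file-writing steps are left out.

```python
from urllib.parse import urlencode

MB_API = "https://musicbrainz.org/ws/2"


def release_lookup_url(release_mbid: str) -> str:
    """Build a MusicBrainz lookup URL for one release (hypothetical helper)."""
    query = urlencode({
        # Spaces are encoded as '+', which is how the web service separates inc values.
        "inc": "recordings isrcs artist-credits release-groups",
        "fmt": "json",
    })
    return f"{MB_API}/release/{release_mbid}?{query}"


def tags_for_release(release: dict) -> list[dict]:
    """Flatten a release JSON document into one tag dict per track.

    Tag names follow Picard's Vorbis comment mapping; other container
    formats would need their own field names.
    """
    album_artist = "".join(
        credit["name"] + (credit.get("joinphrase") or "")
        for credit in release.get("artist-credit", [])
    )
    media = release.get("media", [])
    rows = []
    for medium in media:
        tracks = medium.get("tracks", [])
        for track in tracks:
            recording = track["recording"]
            tags = {
                "ALBUM": release["title"],
                "ALBUMARTIST": album_artist,
                "TITLE": track["title"],
                "TRACKNUMBER": str(track["position"]),
                "TRACKTOTAL": str(len(tracks)),
                "DISCNUMBER": str(medium["position"]),
                "DISCTOTAL": str(len(media)),
                "MUSICBRAINZ_ALBUMID": release["id"],
                "MUSICBRAINZ_RELEASEGROUPID": release["release-group"]["id"],
                "MUSICBRAINZ_RELEASETRACKID": track["id"],
                # In Picard's mapping, MUSICBRAINZ_TRACKID holds the recording MBID.
                "MUSICBRAINZ_TRACKID": recording["id"],
            }
            isrcs = recording.get("isrcs") or []
            if isrcs:  # ISRC is only written when MusicBrainz has one
                tags["ISRC"] = isrcs[0]
            rows.append(tags)
    return rows
```

Because every value comes from a single release document, running this twice on the same MBID gives the same tags, which is the determinism I am after.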
A few design choices that may be relevant to this community:
- MusicBrainz is the only canonical metadata source.
- Searches and queueing are based on MBIDs rather than free text.
- Final files embed MusicBrainz identifiers so the library stays resolvable later.
- Releases are processed using proper release/release-group relationships from MB rather than guessed album data.
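As an illustration of what MBID-based queueing could look like, here is a minimal sketch. It assumes MBIDs are plain UUIDs and that a "track" selection maps to a MusicBrainz recording. The `MbidQueue` class and `normalize_mbid` helper are hypothetical names, not the project's real API.

```python
import uuid
from collections import OrderedDict

# Entity types a job may refer to (assumed set for this sketch).
ENTITY_TYPES = {"recording", "release", "release-group"}


def normalize_mbid(value: str) -> str:
    """Return the canonical lowercase, hyphenated MBID or raise ValueError."""
    return str(uuid.UUID(value.strip()))


class MbidQueue:
    """FIFO queue keyed by (entity type, MBID), so duplicates collapse."""

    def __init__(self) -> None:
        self._items: "OrderedDict[tuple[str, str], None]" = OrderedDict()

    def add(self, entity_type: str, mbid: str) -> bool:
        """Queue an entity; return False if it was already queued."""
        if entity_type not in ENTITY_TYPES:
            raise ValueError(f"unsupported entity type: {entity_type}")
        key = (entity_type, normalize_mbid(mbid))
        if key in self._items:
            return False
        self._items[key] = None
        return True

    def pop(self) -> tuple[str, str]:
        """Remove and return the oldest queued (entity type, MBID) pair."""
        key, _ = self._items.popitem(last=False)
        return key

    def __len__(self) -> int:
        return len(self._items)
```

Keying on the identifier rather than on a search string means the same release queued twice, even with different casing, is only processed once, and free text is rejected before it reaches the pipeline.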
Right now the project is still early (v0.9.x), but the search → acquisition → tagging pipeline works end-to-end. In my own testing, it resolves over 90% of the tracks in the full albums I have attempted.
I’d really appreciate any feedback from the MusicBrainz community, especially around:
- best practices when resolving releases vs release-groups
- metadata fields that are particularly important to embed
- things I might be overlooking when relying heavily on MBIDs
Project repo:
Release Page:
Thanks to everyone involved in MusicBrainz — the project wouldn’t exist without the data ecosystem you’ve built!!!
- Logan
(loganbuilt (Logan) · GitHub)