Indie Artist Bot

Hi all – I’m a music enthusiast with an idea I’d like to propose and sanity-check with the community here.

For decades, the best resource music explorers have had for discovering new music has been some form of “if you liked this artist, you might also like these”. Whether you’re sifting genre bins and getting recommendations from fellow enthusiasts at the record shop, flipping through “People Who Bought This Album Also Bought These” lists at online retailers, or spending hours diving to the bottom of every “Related Artist” tree on modern streaming services, you keep seeing the same static content over and over, grouped by stereotype. Even sites like Gnoosic or Music Map (which I respect deeply and have used extensively) work this way. If you’re a Led Zeppelin fan and want to find modern bands that sound like Led Zeppelin, these methods will keep you locked in a loop of rock music from the same era.

I’m building a music discovery application that aims to change this paradigm by basing recommendations not on tags, metadata, or popularity, but on actual sonic similarity (with a user-configurable chaos factor in case you want to shake things up a bit). And it comes with a twist: it emphasizes emerging indie artists. We’d like to contribute this discovery data back to the MusicBrainz database to help surface new artists to the public.
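
To give a feel for the idea, here is a minimal sketch of similarity ranking with a chaos factor. It assumes each artist is already reduced to a numeric feature vector and uses cosine similarity blended with random noise; the function names, the blending formula, and the toy vectors are illustrative assumptions, not the application's actual method.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(seed, catalog, k=3, chaos=0.0, rng=None):
    """Return up to k artist names from catalog, ranked against the seed vector.

    chaos is a hypothetical knob in [0, 1]: 0 ranks purely by similarity,
    1 ranks purely at random, values in between blend the two.
    """
    if not 0.0 <= chaos <= 1.0:
        raise ValueError("chaos must be between 0 and 1")
    rng = rng or random.Random()
    scored = []
    for name, vec in catalog.items():
        score = (1.0 - chaos) * cosine(seed, vec) + chaos * rng.random()
        scored.append((score, name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]
```

With `chaos=0.0` the ranking is deterministic; raising it lets less similar artists climb the list.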

The basic loop:

The platform is anchored entirely on MusicBrainz data, specifically on MusicBrainz IDs. We crawl developer-friendly sources of audio signal, like Bandcamp, and bind artists to their platform pages by platform ID, running extensive analysis over the actual music. A human spot-checks the data and approves the artist for display in the application. Ideally, the artist would then be submitted to MB and get an official MB ID via the formal curation process. That ID would then be re-ingested via the official fortnightly incremental DB exports and used to display an MB acceptance/verification state in the UI, not unlike verified Twitter/X accounts.
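
To make the loop concrete, here is a minimal sketch of the artist lifecycle as a small state machine. The stage names, `ArtistRecord`, `advance`, `reconcile`, and the `export_index` mapping are all hypothetical names chosen for illustration, not a description of the production code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Stage(Enum):
    CRAWLED = "crawled"      # found on a source platform such as Bandcamp
    ANALYZED = "analyzed"    # audio analysis finished
    APPROVED = "approved"    # human spot-check passed; visible in the app
    SUBMITTED = "submitted"  # submitted to MusicBrainz for curation
    VERIFIED = "verified"    # MB ID seen in a re-ingested export


ORDER = [Stage.CRAWLED, Stage.ANALYZED, Stage.APPROVED, Stage.SUBMITTED, Stage.VERIFIED]


@dataclass
class ArtistRecord:
    platform: str
    platform_id: str
    name: str
    stage: Stage = Stage.CRAWLED
    mbid: Optional[str] = None


def advance(record: ArtistRecord, to: Stage) -> None:
    """Move a record forward exactly one stage; skipping stages is an error."""
    if ORDER.index(to) != ORDER.index(record.stage) + 1:
        raise ValueError(f"cannot go from {record.stage.value} to {to.value}")
    record.stage = to


def reconcile(records: List[ArtistRecord],
              export_index: Dict[Tuple[str, str], str]) -> List[ArtistRecord]:
    """Mark submitted artists as verified once their platform page maps to an MB ID.

    export_index is assumed to map (platform, platform_id) -> MB ID, built from
    the URL relationships in a re-ingested export.
    """
    verified = []
    for record in records:
        if record.stage is Stage.SUBMITTED:
            mbid = export_index.get((record.platform, record.platform_id))
            if mbid:
                record.mbid = mbid
                record.stage = Stage.VERIFIED
                verified.append(record)
    return verified
```

Forcing one-step transitions means an artist cannot reach the submitted stage without having passed the human spot-check.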

The app’s contributions:

  1. Underground artist submissions: We have a discovery pipeline that surfaces Bandcamp artists with real catalogs that MusicBrainz doesn’t know yet: Self-released, no distributor, often no presence in any database. Before we submit anything, an artist passes full triage: Verified catalog, audio fingerprinting against our corpus (no bootleg re-uploads), and human validation. A submission to MB would carry: name, sort name, area (from their source profile), disambiguation as needed, and URL relationships (streaming sources, official/social sites they list). Planned volume is ~50/week to start, batched and throttled well under the rate guidance, with every edit including an evidence note. This will all be done using a dedicated bot account (crates_bot, already created) per the Bots guidelines, and tested on test.musicbrainz.org first.

  2. Genre tag submissions: We score every artist’s actual audio against the MB genre tree (using calibrated z-scores, not raw model output). We’d like to submit our top tags as regular folksonomy tags via the ws/2 API on artists we have high confidence about. Same bot account, same throttle.
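
For the genre-tag step, here is a rough sketch of picking tags by calibrated z-score and building the XML body for a ws/2 tag submission. The calibration table, the `min_z` threshold, the helper names, and the placeholder MB ID are illustrative assumptions. The payload shape follows my reading of the MusicBrainz API docs and should be checked against them before use. Nothing here sends a request.

```python
import xml.etree.ElementTree as ET

MMD_NS = "http://musicbrainz.org/ns/mmd-2.0#"


def z_scores(raw, calibration):
    """Convert raw model scores to z-scores.

    raw: genre -> raw score; calibration: genre -> (mean, std) measured over
    a reference corpus. Genres with zero spread are skipped.
    """
    out = {}
    for genre, score in raw.items():
        mean, std = calibration[genre]
        if std > 0:
            out[genre] = (score - mean) / std
    return out


def top_tags(z, min_z=2.0, limit=3):
    """Keep at most `limit` genres whose z-score clears the (assumed) threshold."""
    keep = sorted(((v, g) for g, v in z.items() if v >= min_z), reverse=True)
    return [g for _, g in keep[:limit]]


def tag_payload(mbid, tags):
    """Build the XML body for submitting folksonomy tags on one artist."""
    ET.register_namespace("", MMD_NS)

    def q(name):
        return f"{{{MMD_NS}}}{name}"

    root = ET.Element(q("metadata"))
    artist = ET.SubElement(ET.SubElement(root, q("artist-list")), q("artist"), {"id": mbid})
    tag_list = ET.SubElement(artist, q("user-tag-list"))
    for tag in tags:
        ET.SubElement(ET.SubElement(tag_list, q("user-tag")), q("name")).text = tag
    return ET.tostring(root, encoding="unicode")
```

As I understand the docs, the body would be POSTed to the ws/2 `tag` endpoint with the bot account's credentials and a `client` parameter, using the same throttle as the artist submissions.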

Questions for the community:

  • Any concerns about the artist-submission criteria, volume, or value?

  • Preferences on the evidence-note format for discovered artists?

  • Anything we should know that the Bots doc doesn’t cover?

We lean hard on MB and want this to be the kind of bot the community enjoys. Please let me know if this integration seems like a good fit; I’m happy to make adjustments. I’d also welcome any general feedback or ideas you might have on the application concept. This is basically something I built for myself and thought it might be awesome to share with the world.

Thanks for your time!

I always encourage anyone to make the most of what’s in the database.

I think a lot of people (not just in the MBz/LB community but further afield) would highly appreciate a toggle to filter out AI nonsense; we have had lengthy discussions on the forum on how to handle such music and the general agreement is such music should be tagged.

Thanks for your comment.

Do you mean you’d like us to try to identify and tag AI-derived audio? I think that’s totally possible + feasible and will add that feature to the application roadmap.

yep that is what people would probably want - automatic identification of AI music