Smart way to filter unknown entities


I’m trying to filter the MusicBrainz database for a project I’m working on: an Akinator-style music guessing game.
I’m working with MySQL and want to filter the database so that only relatively well-known entities (artists) and their respective relationships remain.
Is there an elegant way to do this? Is there an existing implementation somewhere? It’s probably something that someone has already done.


You could cross-reference with the data dumps from ListenBrainz and set a threshold, such as listens per month or the percentage of users who have listened to an artist, to mark a cut-off.
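
As a rough sketch of the percentage-of-users idea, something like the following could work. It assumes you have already reduced the listen data to (user, artist) pairs; that input shape, the function name `well_known_artists`, and the `min_user_share` parameter are all made up for illustration and are not the actual ListenBrainz dump format.

```python
from collections import defaultdict


def well_known_artists(listens, min_user_share=0.05):
    """Return the artist IDs heard by at least min_user_share of all users.

    listens: iterable of (user_id, artist_id) pairs. This shape is an
    assumption for the sketch, not the real dump layout.
    """
    users_by_artist = defaultdict(set)
    all_users = set()
    for user_id, artist_id in listens:
        # Sets ensure each user counts once per artist, however often they listened.
        users_by_artist[artist_id].add(user_id)
        all_users.add(user_id)

    if not all_users:
        return set()

    total_users = len(all_users)
    return {
        artist_id
        for artist_id, users in users_by_artist.items()
        if len(users) / total_users >= min_user_share
    }


# Toy data: four users, three artists.
sample = [
    ("u1", "artist-a"), ("u2", "artist-a"), ("u3", "artist-a"), ("u4", "artist-a"),
    ("u1", "artist-b"),
    ("u2", "artist-c"), ("u4", "artist-c"),
]
keep = well_known_artists(sample, min_user_share=0.5)
```

The resulting set of IDs could then be loaded into a table of its own and joined against the artist and relationship tables, so that only rows touching a kept artist survive. Where to put the threshold is a judgment call you would probably have to tune by looking at which artists fall on either side of it.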

If you go by relationships alone, something like ROD might make your list even though they’re very niche. You’d also likely get a lot of Japanese and Korean pop groups, among others, that might not be well known to the general public.