Algorithms to identify Classical releases


When tagging with MusicBrainz data, there is a requirement to identify whether or not a track is classical, most notably so that the track artist can be set to the performers/conductor rather than the composer. I've started work on some heuristics to do this. Note that the idea is to find Classical releases whether or not they have been entered using CSG, without incorrectly identifying non-classical releases as classical. Clearly there are edge cases about what counts as classical, but there are always edge cases.

It would be great to get some feedback on them:
If recording has a work type and it is Song then Not Classical
If recording artist credit has different artist IDs to the track artist credit then Classical
If recording has two or more levels of works then Classical
If title contains certain words (such as allegro) then Classical
If release is by a Classical-only record label (such as Hyperion) then Classical
If track has a composer relationship and they died more than 100 years ago then Classical
If track has a conductor relationship then Classical
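
As a rough illustration, the rules above could be written as independent checks that each report whether they fire for a track. This is only a sketch: the `Track` class and its field names are hypothetical and do not mirror the MusicBrainz schema or API, and the word and label sets contain just the two examples given in the list.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical, simplified view of one track. Field names are illustrative
# only and are not taken from the MusicBrainz schema or web service.
@dataclass
class Track:
    title: str
    track_artist_ids: frozenset
    recording_artist_ids: frozenset
    work_type: Optional[str] = None
    work_levels: int = 0
    label: Optional[str] = None
    composer_death_year: Optional[int] = None
    has_conductor: bool = False

# Placeholder sets holding only the examples named in the list above.
CLASSICAL_WORDS = {"allegro"}
CLASSICAL_LABELS = {"Hyperion"}

def indicators(track, current_year):
    """Return the names of the heuristics that fire for this track."""
    hits = set()
    if track.work_type == "Song":
        hits.add("song_work_type")  # the only rule pointing away from Classical
    if track.recording_artist_ids != track.track_artist_ids:
        hits.add("artist_credit_mismatch")
    if track.work_levels >= 2:
        hits.add("multi_level_work")
    title_words = set(re.findall(r"[a-z]+", track.title.lower()))
    if title_words & CLASSICAL_WORDS:
        hits.add("title_word")
    if track.label in CLASSICAL_LABELS:
        hits.add("classical_label")
    if (track.composer_death_year is not None
            and current_year - track.composer_death_year > 100):
        hits.add("old_composer")
    if track.has_conductor:
        hits.add("conductor")
    return hits
```

Returning the names of the rules that fired, rather than a single yes/no, keeps each rule inspectable, so it is easy to see which ones misfire on a given release.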


I can’t see any of those except maybe the classical label being consistent enough to be helpful here.


Definitely not the case, there are lots of classical songs. Schubert wrote hundreds :slight_smile:

Applies also to a lot of old pop (think 40s, 50s).

Most of the others are not enough on their own, but might generally be good enough if multiple of them are hit at the same time.
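
One way to picture "multiple of them hit at the same time" is a simple count of how many positive indicators fire, with a minimum required. The sketch below is an assumption-laden illustration: the indicator names and the default of two hits are made up for the example, not something agreed in this thread.

```python
# Hypothetical names for the positive indicators; the "Song" work type rule
# is left out here because classical songs exist.
POSITIVE_INDICATORS = frozenset({
    "artist_credit_mismatch",
    "multi_level_work",
    "title_word",
    "classical_label",
    "old_composer",
    "conductor",
})

def looks_classical(hits, min_hits=2):
    """True when at least `min_hits` positive indicators fired together.

    `hits` is any iterable of indicator names; unknown names are ignored.
    The default of 2 is an arbitrary starting point, not a tuned value.
    """
    return len(set(hits) & POSITIVE_INDICATORS) >= min_hits
```

With the default setting a lone conductor relationship is not enough, while a conductor plus a long-dead composer is.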


Interesting. I agree that while none of these factors alone is good enough, taken together they might produce fairly consistent results. But I’m not sure what your goal is. Are you trying to determine:

  1. whether the music itself (compositions) are “classical”?
  2. whether the release should be filed under “classical” for broad categorization purposes?
  3. whether the CSG was used to enter the release in MB?


Good question, I suppose I'm trying to do 2 (so whether or not the release should really be considered classical after considering all the tracks in the release), and my main reason for doing so is to add the performing artist/orchestra/conductor as the track artist rather than just using what MusicBrainz stores as track artist. If the release uses CSG, the track artist will be wrong for this purpose, but hopefully the right value can be derived from the recording artist and/or relationships. If the release does not use CSG, the track artist may or may not be correct, but knowing whether it is Classical tells me whether to just use the track artist or to try some other things.
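
To make the release-level idea concrete, here is a minimal sketch that turns per-track verdicts into one decision for the whole release. The function name and the "more than half of the tracks" cut-off are assumptions chosen for illustration; any other fraction could be used.

```python
def release_is_classical(track_flags, min_fraction=0.5):
    """Decide for a whole release from per-track True/False verdicts.

    Returns True when strictly more than `min_fraction` of the tracks
    were judged classical. An empty release is treated as not classical.
    """
    flags = list(track_flags)
    if not flags:
        return False
    return sum(flags) / len(flags) > min_fraction
```

Deciding once per release means a single oddly credited track does not flip how the artist field is handled for the rest of the album.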


Thanks, I did not know this.

Agreed, I'm leaning towards this.


It seems to me that a better way to do this is to have MusicBrainz store a field which says which style rules the editors use on this Release: Ordinary, or CSG.

A still better way to do this would be to figure out what value the CSG is delivering, and find a way to put that into a better database schema and better web UI, so that every Release follows the same style guide. For instance, find a way to generate CSG-compliant strings to use as “track title” in taggers, from the track titles as credited, and the Works to which the Recordings are linked. But that approach takes a lot of work in database design, web UI development, and data improvement.


I have already requested that as MBS-9020, and it would be a great help. However, I’m looking for a solution now, and I would still want to find classical releases that aren't using CSG, because even without CSG they need to be dealt with differently to Pop/Rock, since the track artist would still probably contain the composer plus other things.


[quote=“ijabz, post:5, topic:133537”]my main reason for doing so is to try and add the performing artist/orchestra/conductor as the track artist rather than just using what MusicBrainz stores as track artist[/quote]

In that case shouldn’t the recording artist fill the bill perfectly?


Some of these might work together, but there are always going to be some exceptions.

I believe we already have some thousands of classical songs added as MB works.

Many soundtracks could be in a similar situation to classical thanks to the soundtrack guideline (track artist=composer, recording artist=performer). Some film scores could be counted as classical, but there are quite a few jazz scores from the 50s and 60s.

This is not limited to classical. Incidental music for theatre plays could be from any genre but would still have acts or scenes like operas. Danny Elfman composed music for a circus performance (Cirque du Soleil), and this was divided into named acts. Most film/TV/game soundtracks have two levels if the separate pieces are linked to a master work; see for example the video game soundtrack for The Great Giana Sisters.

It’s common to use Italian tempo markings in classical music, but naturally the same words can appear in any Italian title, and their usage in music isn’t limited to classical. Just try searching for recordings with “Allegro” in their titles and you’ll notice how bad this idea is. You could also try “Symphony” for useless results.

Hyperion might be one of the labels that limits itself to classical, but Deutsche Grammophon, Sony Classical and Naxos, for example, also release some other stuff. It depends on how wide our definition of classical is. Most of the biggest classical labels release film score recordings.

Wikipedia’s definition is “Classical music is art music produced or rooted in the traditions of Western music”, so if we have some African or Chinese music that is hundreds of years old, we might not want to count it as classical. Even in Western music we have old folk music, which commonly isn’t counted as classical. How about old religious hymns? National anthems?

Jazz orchestras (big bands) and studio orchestras performing for movie/TV/game soundtracks often have a conductor. Choirs also commonly have conductors, but not all choirs perform classical works.


Yes, sometimes it would be, but it's inconsistent, so I look at the artist relationships and the Recording Artist field to derive the best value. But even if it was fine, I would still need to know whether the release is classical, because I wouldn’t want to use Recording Artist for non-Classical releases; there I would want to use the track artist.
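
A small sketch of that decision, under stated assumptions: the function and parameter names are hypothetical, and the order of preference for classical releases (performer relationships first, then recording artist, then track artist as a last resort) is just one plausible reading of "derive the best value", not a settled rule.

```python
def choose_tag_artist(track_artist, recording_artist, performers, is_classical):
    """Pick the string to write to the artist tag for one track.

    Non-classical: always the track artist as credited on the release.
    Classical: prefer performers gathered from relationships, then the
    recording artist, and fall back to the track artist if both are empty.
    """
    if not is_classical:
        return track_artist
    if performers:
        return "; ".join(performers)
    return recording_artist or track_artist
```

For example, a classical track credited to the composer would be tagged with its orchestra and conductor when those relationships exist, while a pop track keeps its release credit unchanged.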


I just did this, and to be honest the results aren’t bad if considered as one of multiple indicators; the vast majority do seem to be Classical.


Do you mean that even if a ‘CSG flag’ were implemented today, you’d need to find a more immediate solution (since obviously the flag would require manual editing)?

Because—while I realize it’s not as clean a solution—we could start using a ‘CSG:true’ folksonomy tag today. That would have an interesting side-effect, which is that the ‘truthiness’ would be (theoretically) weighted by votes, not simply true|false.


The majority, for sure. It naturally depends on what your needs are. On the first page of this search there are 25 recordings. 5 of them aren’t classical (Garcia, Mauceri, Petitgand, Wono, Zorn), so that’s 20% wrong results.

The recording by “Hollywood Bowl Orchestra, John Mauceri” is a recording of a 1947 musical and has a conductor relationship. For this one, two rules from your list would apply and misidentify it as classical music.


I’m just saying I think it's a useful indicator. But when I look at this release, it does appear to have been entered as CSG (note the difference between track and recording artist credit), so it might be right to treat it as if it was Classical anyway, even though it isn't really.


It most likely follows either the theatre or the soundtrack guideline.


Oh wow, I did not realize that these guidelines break the regular recording artist credit/track artist credit relationship as well (i.e. the recording artist credit is the same as the track artist credit for the first release it is added for).

I'm not sure what users would like to see as the artist field in their tagged files for such releases, but it's certainly problematic.


A thought: maybe, to ensure some sort of consistency, the standard definition of recording artist should be changed. It currently says

The artist should usually be the same as the first release of the recording.

I've always understood this to be the standard track artist for this recording, only differing when the recording appears on a release where it is credited slightly differently. (Whereas when using CSG we no longer have this standard concept; instead the recording artist is used only for the performers rather than whoever is credited, and the track artist is used for the composer.)

But if instead we abandoned this standard concept and said something like

The artist should be the most important performers of the release

this could be said to be broadly similar for all types of release, instead of how it is now, where it means completely different things for different releases.

To put it another way: for non-classical, is my interpretation of the recording artist as the standard artist correct or not, and if it is, is the standard recording artist concept actually useful?


But for non-classical they’re the same anyway in almost every case, aren’t they?


Mostly, but not all the time; track artist is the correct value to use.