Making a database useful for music education, help?

Hi all,

I’m a free/libre/open (FLO) activist culturally and technologically, and I
teach music lessons for a living. I started making notes about basic
chord structures in songs which expanded to noting other analytic
features like meter and so on.

So, as a simple example, I noted:

  • Songs that loop just two chords with a repeating period (which means a two-phrase group where one finishes unresolved and the other resolves) where first phrase starts with I and ends with V, second phrase starts with V and ends with I. In short: | I… V | V… I |
    • Polly-Wolly Doodle
    • Jambalaya On The Bayou — Hank Williams
    • Hang Down Your Head, Tom Dooley
    • Shoo Fly
    • Down in the Valley
    • Mein Hut
    • Animal Fair
    • Hush, Little Baby
    • Where Oh Where Has My Little Dog Gone?
    • Marianne (Caribbean folk song)
    • Pay Me My Money Down (Caribbean folk song)

And that’s from the much simpler beginning list.

Later, I’ve noted that Mad World by Tears for Fears and Scarborough Fair are examples of songs in Dorian.

I’ve got places where I wrote a brief chart of the chord pattern for specific songs (numerically; key is not the point, although the key of a particular recording could be noted). These songs may not precisely match the pattern of any other song, but they fit in categories such as using a certain list of chords, having a certain form…

I started noting these features for the ideal ultimate database:

  • chords

    • 1 power-chord / drone
      • able to skip all maj/min/7 distinctions if sticking with this
    • 1 major chord
    • 1 minor chord
    • I & V
    • full cadence
    • half cadence
    • syncopated chord changes
    • I & IV
    • I IV & V
    • moving between IV & V directly
    • plagal cadence
    • vi chord, relative minor/major
    • ii chord (iv of vi)
    • iii chord (v of vi)
    • 7th chords
      • dom 7th chords
      • m7 (stacked triad M over m)
      • M7 (stacked triad m over M)
    • secondary dominants
      • III chord (typically V of vi, can lead to IV though)
      • II (V of V, but can lead to IV)
      • VI (V of II)
      • circle progressions generally (sequential secondary dominants)
    • borrowed from parallel minor
      • bVII (as V is to vi)
      • bVI (as IV is to vi)
      • bIII (as I is to vi)
      • iv in major key
    • minor v in major key
      • as ii for moving to IV as new key (or temp modulation)
    • key changes
    • chords with 9s
    • diminished triad (upper part of dom7)
    • half-dim (m7b5) (upper part of dom9)
    • fully diminished
      • upper part of dom♭9 or dom7 w/ root moved up
      • stacked m3’s, looping/sliding
    • augmented
    • descending bass progressions
    • flamenco 4-3-2-1 / 8-7-6-5 progression
    • altered mediants
  • Form

    • simple loop
    • periods
    • harmonic rhythm
    • sections
      • verse
      • chorus
      • bridge
      • intro / outro
      • pre-chorus
    • 12-bar blues etc
    • through composed
    • evolving (minimalism)
  • melody

    • all in chord, arpeggio melodies
    • non-harmonic tones
      • passing
      • neighbor
      • suspension
      • others
      • accented vs unaccented
    • pentatonic melody
    • major scale melody
    • minor scale(s) melody
    • non-diatonic harmonic 7th in melodies
    • chromaticism
    • modal melody
    • other scales
    • wide tessitura
  • rhythm

    • duple meter
    • triple meter
    • compound meter
    • swing rhythm
    • changing meters
    • complex/odd meter
    • polyrhythms
      • hemiola
    • polypatterns
    • tuplets

This is far from complete; it just shows my thinking. Ideally, a database would let people go in both directions: (A) look up a song and see a list of what chords it uses, its basic form, etc., so a student would know what skills and knowledge are needed to play that song as they know it (they could always simplify where needed), and (B) search the database for songs that use only the chords or other features searched for.
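To make the two directions concrete, here is a minimal sketch in plain Python. The feature labels are taken from the list above, and the song entries reuse the examples already mentioned; the function names `features_of` and `songs_within` are made up for illustration, not part of any existing tool.

```python
# Song -> set of features. Labels come from the feature list above;
# the assignments reuse the examples given earlier in this post.
SONG_FEATURES = {
    "Polly-Wolly Doodle": {"I & V", "periods", "simple loop"},
    "Shoo Fly": {"I & V", "periods", "simple loop"},
    "Scarborough Fair": {"modal melody"},
    "Mad World": {"modal melody"},
}


def features_of(song):
    """Direction (A): what does a student need to know to play this song?"""
    return sorted(SONG_FEATURES[song])


def songs_within(known_features):
    """Direction (B): songs that use ONLY features the student already knows."""
    known = set(known_features)
    return sorted(
        song for song, needed in SONG_FEATURES.items() if needed <= known
    )


if __name__ == "__main__":
    print(features_of("Shoo Fly"))
    print(songs_within({"I & V", "I & IV", "periods", "simple loop"}))
```

The subset test (`needed <= known`) is what makes (B) an "only these features" search rather than an "any of these features" search.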

Rather than build a new database from scratch, I would prefer if it were
possible to build these things into an existing FLO database. Would
AcousticBrainz or other here possibly make sense for this or be useful?

I’m less interested in marking every tiny nuance in every version of a recording of a song and more
interested in marking the basic structure.

I’d rather found a community or join a community than make this all just my own quirky project. I could imagine tons of AcousticBrainz features being tangentially useful, but probably not directly what I’m looking for.

Any thoughts on this? Suggestions?

Some of these would probably be good candidates for Work and/or Recording attributes (and, to some degree, some already are). Take a look at the Work Attributes section next time you add or edit a Work in MusicBrainz.

Speaking of Works,

should generally be what MusicBrainz Works are.

One thing I would recommend though, is looking into making a database of some sort on top of MusicBrainz. For example, run your own replicated MusicBrainz database instance and use Work MBIDs/gids as foreign keys in your own project’s database. This means that you don’t duplicate the effort of adding Works or assigning composers, lyricists, external links and identifiers, etc., and it also means you’re free to deal with all the extra data exactly as you see fit. (This is essentially how services like CritiqueBrainz, AcousticBrainz, AcoustID, etc. work: they keep their own data in their own database, which stores MusicBrainz identifiers for getting additional data from MusicBrainz.)

Thanks. For reference, while I’ve been aware of MusicBrainz and contributed via the interface of some programs that use it, I’ve not yet volunteered directly.

One thing I would recommend though, is looking into making a database of some sort on top of MusicBrainz.

That sounds like the best approach, and I’d like to invite everyone interested in this idea to discuss it here to figure out the best way to do this.

It looks like I could basically define tags and add them to all the works in question, but while some patterns are common enough that a tag makes sense, other cases are too particular for one. I think, just focusing on chord structure, that it would be ideal to have a set of common tags for common patterns, then tag all the songs that contain those patterns. But it would also be good to have the full structure written out in a separate field, other than tags, for each Work whose structure goes beyond one or two of the tag-level patterns.
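A rough sketch of that split between tags and a written-out structure, in Python. Everything here is an assumption for illustration: the `WorkAnalysis` class, the bar-chart notation (borrowed from the `| I… V | V… I |` shorthand earlier in the thread), and the small `COMMON_PATTERNS` table. It only shows that common tags could be derived from the full chart where they apply, while unusual charts simply keep their chart and get no pattern tag.

```python
from dataclasses import dataclass, field

# Hypothetical table of tag-level patterns: tag name -> exact chord vocabulary.
COMMON_PATTERNS = {
    "I & V": {"I", "V"},
    "I & IV": {"I", "IV"},
    "I IV & V": {"I", "IV", "V"},
}


def chords_in(chart):
    """Return the set of chord symbols in a chart like '| I… V | V… I |'."""
    return set(chart.replace("|", " ").replace("…", " ").split())


def auto_tags(chart):
    """Tags whose chord vocabulary matches the chart exactly."""
    used = chords_in(chart)
    return {tag for tag, chords in COMMON_PATTERNS.items() if used == chords}


@dataclass
class WorkAnalysis:
    title: str
    chart: str                              # full structure, always kept
    tags: set = field(default_factory=set)  # common patterns only

    def __post_init__(self):
        # Merge hand-entered tags with any that follow from the chart.
        self.tags = set(self.tags) | auto_tags(self.chart)


if __name__ == "__main__":
    print(WorkAnalysis("Shoo Fly", "| I… V | V… I |").tags)
```

Hand-entered tags (form, meter, and so on) still go in `tags`; only the chord-vocabulary tags are derived here.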

The other concern is that this work can’t use the NC restriction, it needs to be a Free Culture work, and I’d prefer the database to be CC-BY-SA, although that would be unfortunate in being incompatible with the supplementary MusicBrainz stuff that is NC (I blame the NC side though, that’s the core source of compatibility problems, that unfortunate non-free restriction). I suppose this means the new DB would have to be on top of MusicBrainz and could only use the core MusicBrainz DB even though my work is itself not at all commercial.

Would it work well to have a DB that just uses a MusicBrainz ID for Works, then has its own fields for various types of tags and other appropriate meta data? Rather than pull in the enormous MusicBrainz DB, if my DB has only hundreds or a few thousand total entries (until it grows as a community project), it should be itself pretty small and not need to access more than the relevant Works from MusicBrainz.
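As a minimal sketch of what such a small standalone DB might look like, here is an SQLite version using only Python's standard library. The table and column names are my own invention, and the MBID in the usage example is a placeholder rather than a real Work; the only thing assumed about MusicBrainz is that Work MBIDs are UUID strings.

```python
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE work (
    mbid  TEXT PRIMARY KEY,  -- MusicBrainz Work MBID (UUID string)
    chart TEXT               -- optional written-out chord structure
);
CREATE TABLE work_tag (
    mbid TEXT NOT NULL REFERENCES work(mbid),
    tag  TEXT NOT NULL,
    PRIMARY KEY (mbid, tag)
);
"""


def open_db(path=":memory:"):
    """Create the (hypothetical) schema in a new SQLite database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_work(conn, mbid, chart=None, tags=()):
    """Store one Work by MBID; raises ValueError if mbid is not UUID-shaped."""
    mbid = str(uuid.UUID(mbid))
    conn.execute("INSERT INTO work (mbid, chart) VALUES (?, ?)", (mbid, chart))
    conn.executemany(
        "INSERT INTO work_tag (mbid, tag) VALUES (?, ?)",
        [(mbid, tag) for tag in tags],
    )
    return mbid


def works_with_tag(conn, tag):
    """Return the MBIDs of all Works carrying the given tag."""
    rows = conn.execute(
        "SELECT mbid FROM work_tag WHERE tag = ? ORDER BY mbid", (tag,)
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = open_db()
    # Placeholder MBID for illustration only, not a real MusicBrainz Work.
    add_work(db, "00000000-0000-0000-0000-000000000001",
             chart="| I V | V I |", tags=["I & V", "periods"])
    print(works_with_tag(db, "I & V"))
```

Titles, composers, and the rest are deliberately absent: they would be looked up in MusicBrainz by MBID rather than copied in.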

This is sounding promising though… I would like to hear from anyone if they think any approach should be considered other than building a DB that uses MusicBrainz this way.


I notice that AcousticBrainz, especially stuff like ChordsDescriptors is relevant enough in an algorithmic fashion. The connection of MusicBrainz and AcousticBrainz means that the sort of more human-written education notes I’d be building would have easy access to other sorts of filters available there, which may be useful.


The CC BY-NC-SA license only applies to non-core data. This is at least folksonomy tags and ratings, maybe also annotations. Stuff like entity names/aliases and MBIDs are all CC0, so would be free for you to use/integrate into your own database. Once you have decided on a way to approach this, you can also contact us more formally through the MetaBrainz Foundation contact page (as you already did once), and Rob may be able to better guide you on how you can use the data without worrying about the NC clause.

You could possibly (in the frontend) just use web service calls rather than querying (a copy of) the MB database directly, maybe with some kind of cache in front of it. I think this is what does.
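A rough sketch of the "web service plus cache" idea in Python, standard library only. The lookup URL follows the MusicBrainz web service's `ws/2/work/<MBID>?fmt=json` form; the `WorkCache` class, the in-memory dict, and the User-Agent string are assumptions for illustration (MusicBrainz asks clients to identify themselves and to stay within its rate limit, which this sketch does not enforce).

```python
import json
import urllib.request

WS_URL = "https://musicbrainz.org/ws/2/work/{mbid}?fmt=json"
# Hypothetical application name and contact; replace with your own.
USER_AGENT = "music-ed-db-sketch/0.1 ( contact@example.org )"


def fetch_work(mbid):
    """Fetch one Work from the MusicBrainz web service as a dict."""
    request = urllib.request.Request(
        WS_URL.format(mbid=mbid), headers={"User-Agent": USER_AGENT}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


class WorkCache:
    """Tiny in-memory cache so each MBID is fetched at most once."""

    def __init__(self, fetch=fetch_work):
        self._fetch = fetch  # injectable, so it can be replaced in tests
        self._cache = {}

    def get(self, mbid):
        if mbid not in self._cache:
            self._cache[mbid] = self._fetch(mbid)
        return self._cache[mbid]
```

With only hundreds or a few thousand Works, a cache like this (or a persistent one on disk) would keep the number of web service calls small without mirroring the whole database.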

Hi, and thanks for your post. I am a classically trained musician, and for a long time I struggled to find a database with musical examples classified by type of musical element, just like the nomenclature you wrote. I started to work on a system to produce such a database, but I am still in the early stages. Do you have any updates regarding your work? Best regards from Paris
