Dream database: what it could be


Hi all,

In a simplified form, the database block diagram from the point of view of discography can be represented by the following line:

Artist -> Release group -> Release -> Recording -> (Relationships to Work, other Artists, Area, etc.)

Looking at the center of this line, it appears that a release generates recordings. This contradicts both logic and chronology and, most importantly, it leaves no way to enter other important information about how recordings were created and how they actually relate to one another.
Those who do discographic research on an artist’s work have all this information, but it cannot be entered into the database, because the necessary entities simply do not exist.

Below, I will try to describe the database as it could be for those who want it to be logical, truly complete, and free of repeated entry of identical data.

To do this, two new entities must be added to the database, and one existing entity must be significantly modified. In addition, it should be possible for an entity to inherit relationships from the entity one level above it. Readers may therefore consider this text delusional, or a utopian request for a very distant reworking of the database. So be it.

  1. Top level: Session
    The existing Event entity is suitable for this, but it would need refinement (a list of titles would have to be added, etc.).
    Currently, we do not have any session information, or even a concept of “session”, in the discographic database. Isn’t it strange?
    There are two subtypes:
    1.1. Recording session
    Contains the same relationships as the current Event.
    Includes: Recorded titles (list).
    1.2. Overdub session
    Currently, we do not have a concept of “overdub” in the discographic database. Isn’t it strange?
    Contains the same relationships as the current Event.
    Includes: Overdubbed titles (list).
    (Note that there is no Relationship with the Work at this level.)

  2. Next level (top to bottom): Performed title
    A new entity.
    This is a set of takes of a title recorded or overdubbed at a session.
    At present, the discographic database has no information on how many takes of the same title an artist recorded at a session, or which ones, and no concept of “take” at all. Isn’t it strange?
    Can be standalone, when information about the corresponding Session is not available.
    There are two subtypes:
    2.1. Recorded title
    Inherits all RS of Recording session with option to modify them.
    Contains RS to Work.
    Includes: Recordings (list)
    2.2. Overdubbed title
    Inherits all RS of Overdub session with option to modify them.
    Contains RS to Work.
    Includes: Overdubs (list)

  3. Next level (top to bottom): Performance
    A new entity.
    Performance is what an artist recorded or overdubbed at a session as a separate indivisible object (take, false start, and even studio noise between takes or false starts: speech, chords, chatter).
    Can be standalone, when information about the corresponding Performed title is not available.
    There are two subtypes:
    3.1. Recording
    Not to be confused with the existing Recording entity (see below).
    This is one of the takes of a title recorded at a session.
    Inherits all RS of Recorded title with option to modify them.
    Includes: Tracks (list)
    3.2. Overdub
    This is one of the takes of a title overdubbed at a session.
    Inherits all RS of Overdubbed title with option to modify them.
    Includes: Tracks (list)
    (Note that the same recording can be implemented (and accordingly released) as many different tracks.)

  4. Lowest level: Track
    This is an existing entity, currently called Recording.
    A Track is what an engineer made from a Recording or Overdub by means of editing and/or mixing.
    Inherits all RS of Recording or Overdub with option to modify them.
    As at present, included in: Releases (list)
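
The four levels above, with RS inherited downward and modifiable at each level, could be sketched roughly as follows. This is only an illustration in Python, not the MusicBrainz schema: the class name `Node`, the `own_rs` field, the `effective_rs()` method, and all sample data are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One entity at any level: Session, Performed title, Performance or Track."""
    level: str                                             # e.g. "Recording session"
    name: str
    own_rs: Dict[str, str] = field(default_factory=dict)   # RS entered at this level
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list, repr=False, compare=False)

    def add(self, child: "Node") -> "Node":
        """Attach a lower-level entity (the "Includes: ... (list)" part)."""
        child.parent = self
        self.children.append(child)
        return child

    def effective_rs(self) -> Dict[str, str]:
        """RS inherited from the upper levels, overridden by RS set at this level."""
        inherited = self.parent.effective_rs() if self.parent else {}
        return {**inherited, **self.own_rs}


# Hypothetical sample data, only to show the mechanics.
session = Node("Recording session", "Session A",
               {"recorded at": "Studio X", "drums": "Drummer 1"})
title = session.add(Node("Recorded title", "Song 1",
                         {"recording of": "Work: Song 1"}))      # Work RS starts here
take = title.add(Node("Recording", "Song 1, take 2",
                      {"drums": "Drummer 2"}))                   # a modified RS
track = take.add(Node("Track", "Song 1, take 2 (single edit)"))

print(track.effective_rs())
```

The Track ends up with the studio from the Session, the Work from the Recorded title, and the drummer as modified on the Recording, without any of them being entered on the Track itself.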

Thank you for your attention.

3 Likes

Interesting.

The current definition of a recording in MusicBrainz: https://musicbrainz.org/doc/Recording
And a Track is also clearly defined: https://musicbrainz.org/doc/Track

About recording sessions, overdubs, etc… where do we get this information? Most of the time it isn’t documented publicly. Artists and recording engineers may know, but will they share it?
What I mean is, while detailed data is something we definitely seek, we focus on data that can be verified and is available without too much hassle, because, well, that’s just more realistic.

That said, one can see a recording session as an event. We currently have the means to store the place, date, and people involved in a published recording as relationships, as well as where this recording appears (see for example https://musicbrainz.org/recording/bef3fddb-5aca-49f5-b2fd-d56a23268d63).

Linking a recording to an event (=recording session), or to another recording (overdub = one recording + another recording) should be possible without changing current definitions.

You are raising interesting points though.

5 Likes

This is very interesting!
I had to use session information for several artist-recording or place-recording relationships.
But being able to conveniently set them up as entities, although it means more work, may make future edits around those sessions incredibly easier.

Note that a session type for events has already been requested and discussed (I haven’t read it yet):

@RocknRollArchivist, what is it that you call RS?

2 Likes

Interestingly, the figure in the definition of Recording that you linked is fully consistent with the structure I proposed! What the database stores under the name “Recording” is actually a “Released Track” (green rectangle in the diagram), not the orange rectangle “Recording”.
(By the way, the green rectangle “Release Track (remaster)” should be removed, because the RS “Remaster” has long been deprecated.)
In fact, what MB stores is not a recording but a track (one implementation of a recording), exactly as the diagram shows, merged with closely matching tracks from other releases. Other tracks made from the same real recording (edits, remixes) are also, mistakenly from the diagram’s point of view, called Recordings in MB. Mistakenly, because these multiple so-called “recordings” are in fact different implementations of one unique recording made by an artist at a recording session.
E.g., an artist recorded a 3:00 take of a song, which was selected as the master for both the single and the album. For the single release, however, the recording was edited (faded out), so the single side runs only 2:00. This case is represented by two different Recordings in the database, because a “recording” running 2:00 evidently cannot be merged with one running 3:00, even though both come from the same original recording.

Sorry, but this is not even an entity. An entity “Recording” should form an array of entities “Track”.

Sorry, but you have probably not looked closely, for a long time, at the boxed sets called “Complete Recordings” or “Complete Works” with their 200-page booklets. All this information is in the booklets. Here are just two examples of releases I have, but I assure you that such information has been published for dozens, if not hundreds, of artists:

  1. https://musicbrainz.org/release/c760a84b-ae81-4b87-b8dd-57a61a3f7fbc/cover-art
    Please look at pages 29 and 30 of the booklet.
  2. https://musicbrainz.org/release/1792195f-bf23-4fa6-a5b3-38d3de31ff0f/cover-art
    Please look there at pages 47 to 68 of booklet “Book 1”.

(By the way, my real name is printed twice on page 215 of this book: under “Tape Comparison and Analysis” and under “Discography”. So I dare to assert that over the years of discographic work I have deeply understood the problem and am fully responsible for the correctness, expediency and efficiency of the above structure.)

Unfortunately, the MB database has simply fallen behind the current state of discography.

Sorry, I can’t agree with you.

You have overlooked the two proposed new entities, without which this structure is not complete and will not work.

You have also overlooked the proposed inheritance of relationships. For example, suppose 5 titles were recorded at the same session, each with 10 takes. When all 50 recordings are in the database, the editor is forced to mechanically repeat the same set of relationships 50 times! It would be correct instead to enter these relationships once for the whole session and have them inherited by each level nested under the Session, down to the Track.
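
To make the arithmetic concrete, here is a rough count of manual relationship entries with and without inheritance. The 5 titles and 10 takes come from the example above; the figure of 8 relationships per session (studio, date, performers, and so on) is an assumption for illustration only, and both function names are hypothetical.

```python
def entries_without_inheritance(titles: int, takes_per_title: int, rs_per_session: int) -> int:
    """Every recording repeats the full set of session relationships."""
    return titles * takes_per_title * rs_per_session


def entries_with_inheritance(rs_per_session: int, overrides: int = 0) -> int:
    """Relationships are entered once on the session; only exceptions are re-entered."""
    return rs_per_session + overrides


RS_PER_SESSION = 8  # assumed number of relationships describing one session

print(entries_without_inheritance(5, 10, RS_PER_SESSION))  # 400
print(entries_with_inheritance(RS_PER_SESSION))             # 8
```

Under these assumptions, the editor enters 8 relationships instead of 400, plus one extra entry for each take where something really differed.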

1 Like

[quote=“jesus2099, post:3, topic:478597”]
This is very interesting!
I had to use session information for several artist-recording or place-recording relationships.
But being able to conveniently set them up as entities, although it means more work, may make future edits around those sessions incredibly easier.[/quote]
Thank you, @jesus2099.
Please look at the Wikipedia page I created, “List of songs recorded by Fats Domino”.
Currently, due to imperfections in the database structure, this information about recording sessions, which is obviously of interest to many people, cannot be entered and stored. That is why I propose improving this structure.

Sorry. RS evidently stands for Relationship.
P.S. The first quote can’t be formatted correctly, I don’t know why.

1 Like

If you want the whole figure about current database structure: https://musicbrainz.org/doc/MusicBrainz_Database/Schema

2 Likes

As I see it, a Recording is “a piece of audio”. I.e., a Recording is the entity we have that represents the given audio you can hear in any given context. This also means that the current Recording entity could easily be made to encompass all the different levels you suggest, and the relationship between them could easily be expressed with, well, relationships.

There is no material difference between your “Performance” and “Track” [note 1]. They are both “a piece of audio”. How they are sourced and how they are cut and how they are eventually used might be different, but at the end of the day, they are ‘just’ audio that happens to have been put to a medium.

I’m not sure about your “Performed title”. Is this actually supposed to represent audio directly? Or just a collection of audio? In the latter case, a Series entity type might do the job just as well—and in the former case, my comments above would apply here as well.

Keep in mind that whenever you want something to be included as a list, that means that it could just as well be provided as a relationship, and we already have several Recording–Recording relationships for when a Recording is included in another Recording in one way or another. We could easily have more of these.

It might also be possible to expand the Recording entity with a “Type” property like we have for Works, Artists, and others, so it’d be possible to differentiate between “Performance” recordings and “Track” recordings as well as potentially other types.
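
As a rough illustration of this single-entity alternative, the sketch below keeps one `Recording` class, adds a type property, and links recordings to each other with relationships. The type values and the “derived from” link name are hypothetical, invented for this sketch; they are not existing MusicBrainz properties or relationship types.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class RecordingType(Enum):          # hypothetical values
    PERFORMANCE = "performance"
    TRACK = "track"


@dataclass
class Recording:
    title: str
    type: RecordingType
    # (link type, target) pairs standing in for Recording-Recording relationships
    rels: List[Tuple[str, "Recording"]] = field(default_factory=list)

    def sources(self) -> List["Recording"]:
        """Recordings this one was derived from (hypothetical 'derived from' link)."""
        return [target for link, target in self.rels if link == "derived from"]


# Hypothetical sample data.
take = Recording("Song 1, take 4", RecordingType.PERFORMANCE)
edit = Recording("Song 1 (single edit)", RecordingType.TRACK, [("derived from", take)])

print([r.title for r in edit.sources()])
```

Here the “list of tracks” of a take is not stored on the take at all; it can be found by looking for every Recording that points back to it.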


Note 1: I will be using the suggested definition of “Track” here, even though I strongly disagree with changing what is considered a Track again. We did this once and it has caused a mess with regard to tag name standards and other things. (Also, a Track is the physical manifestation of “a piece of audio”, e.g., the engraving on a vinyl or the magnetic encoding on a tape. A Track is unique per medium (a CD and a vinyl have different tracks), even if they encode/carry the same “piece of audio”.)

5 Likes

Sorry, but it’s easier for me to deal with specific data than with abstract concepts like “a piece of audio”. Therefore, I’ll just give a specific example of one recording session of Little Richard and ask you to answer how to fulfill the main purposes of the proposed new entities 2 and 3 (the text in italics) in the existing database structure.

I would also love to hear your opinion on the inheritance of relationships from session to track.

  1. Session

1.1. Recording session
1956-08-01 - Cosimo J & M Recording Studio, 525 Gov. Nicholls St., New Orleans, LA - Little Richard [vcl, pno], Raymond Montrell [gt], Frank Fields [bass], Earl Palmer [drums], Lee Allen [tenor sax], Alvin “Red” Tyler [baritone sax]
Recorded titles list:

  • Shake a Hand (link to 2.1.1)
  • Can’t Believe You Wanna Leave (link to 2.1.2)

1.2. Overdub sessions

1.2.1. 1959-02-09, Los Angeles - vocal chorus by The Stewart Sisters (Trudy Hancock, Irene Diaz, Darlene Paul)
Overdubbed titles list:

  • Shake a Hand (link to 2.2.1.1)

1.2.2. 1970-09-01, Los Angeles, Paramount Studio Los Angeles - produced by Mike Akopoff, arranged by Bob Harmon with musicians: Bob Harmon and Don Kerian [tp], Terry Woodson [tbn]. Stereo
Overdubbed titles list:

  • Shake a Hand (link to 2.2.1.2)
  2. Performed title
    The main purpose is a list of all available takes of this title recorded or overdubbed at this session. It also makes it possible to separate the recordings of a given session from recordings of the same title made at other session(s).

2.1.1. Recorded title - Shake a Hand

  • take 2 (link to 3.1.1)
  • take 4 (link to 3.1.2)

2.1.2. Recorded title - Can’t Believe You Wanna Leave

  • take 6 (link to 3.2.1)
  • take 8 (link to 3.2.2)
  • take 9 (link to 3.2.3)

2.2.1.1. Overdubbed title - Shake a Hand

  • take 4 + sole overdub take from 1st Overdub session

2.2.1.2. Overdubbed title - Shake a Hand

  • take 4 + sole overdub take from 2nd Overdub session

  3. Performance
    The main purpose is a list of all available tracks derived from this recording or overdub. It also makes it possible to distinguish the tracks derived from each recording (take) of a given session from one another, as well as from tracks derived from recordings of the same title made at other session(s).

3.1. Recording

3.1.1. Recording - Shake a Hand, take 2
Tracks list:

  • Shake a Hand, take 2 (link to 4.1.1)

3.1.2. Recording - Shake a Hand, take 4
Tracks list:

  • Shake a Hand, take 4 undubbed with chat (link to 4.1.2.1)
  • Shake a Hand, take 4 undubbed with false start & chat (link to 4.1.2.2)
  • Shake a Hand, take 4 overdubbed by vocal chorus (link to 4.1.3)
  • Shake a Hand, take 4 stereo overdubbed by wind instruments (link to 4.1.4)

3.2. Recording

3.2.1. Recording - Can’t Believe You Wanna Leave, take 6
Tracks list:

  • Can’t Believe You Wanna Leave, take 6 (link to 4.2.1)

3.2.2. Recording - Can’t Believe You Wanna Leave, take 8
Tracks list:

  • Can’t Believe You Wanna Leave, take 8 (link to 4.2.2)

3.2.3. Recording - Can’t Believe You Wanna Leave, take 9
Tracks list:

  • Can’t Believe You Wanna Leave, take 9 (link to 4.2.3)

3.2. Overdub

3.2.1. Overdub - Shake a Hand, take 4 + sole overdub take from 1st Overdub session

3.2.2. Overdub - Shake a Hand, take 4 + sole overdub take from 2nd Overdub session

  4. Track

4.1.1. Track - Shake a Hand, take 2

4.1.2.1. Track - Shake a Hand, take 4 undubbed with chat

4.1.2.2. Track - Shake a Hand, take 4 undubbed with false start & chat

4.1.3. Track - Shake a Hand, take 4 overdubbed by vocal chorus

4.1.4. Track - Shake a Hand, take 4 stereo overdubbed by wind instruments

4.2.1. Track - Can’t Believe You Wanna Leave, take 6

4.2.2. Track - Can’t Believe You Wanna Leave, take 8

4.2.3. Track - Can’t Believe You Wanna Leave, take 9
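
The “link to” references in the outline above form a chain from each track up to its session. A minimal sketch of following that chain, using only the “Shake a Hand” links of recording session 1.1 (the overdub sessions are left out for brevity); the names `PARENT`, `chain` and `tracks_of` are hypothetical helpers for this illustration:

```python
# Parent links copied from the numbered outline:
# track -> recording -> recorded title -> recording session.
PARENT = {
    "4.1.1": "3.1.1",
    "4.1.2.1": "3.1.2",
    "4.1.2.2": "3.1.2",
    "4.1.3": "3.1.2",
    "4.1.4": "3.1.2",
    "3.1.1": "2.1.1",
    "3.1.2": "2.1.1",
    "2.1.1": "1.1",
}


def chain(item: str) -> list:
    """Follow the links upward from an item to the top-level session."""
    path = [item]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def tracks_of(session: str) -> list:
    """All tracks (items numbered 4.x) that ultimately come from the given session."""
    return sorted(k for k in PARENT if k.startswith("4.") and chain(k)[-1] == session)


print(chain("4.1.3"))    # ['4.1.3', '3.1.2', '2.1.1', '1.1']
print(tracks_of("1.1"))
```

So the track with the vocal chorus overdub (4.1.3) can be traced back to take 4 (3.1.2), to the recorded title (2.1.1), and to the 1956-08-01 session (1.1) without repeating any session data on the track.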

2 Likes

A little bit about history and evolution.

The current database structure has developed historically. In the early days, the “center of the universe” in discography was the release. An artist’s discography was limited to a list of his singles and albums. Only the most advanced discographies printed album tracklists, and these tracklists were the only source of information about the recordings.

When I first visited MB, no attention was paid to the recordings here. There was no such entity as a recording. And since I was interested just in the differences in the recordings under the same title, I left the site and forgot about it for a long time.

As time passed, the concept of discography began to change. European labels began to produce boxed sets whose booklets listed the artist’s recordings in the chronological order of their creation, rather than by album. One of the pioneers was the British label Charly Records, in 1982. Under license from the US label Sun, they released boxes named “The Sun Years” by Carl Perkins (SUN BOX 101), then Jerry Lee Lewis (SUN BOX 102), and so on. Please take a look at the discography (although the booklets did not yet call it that) of Carl Perkins and Jerry Lee Lewis.

Later, in 1986, the German label Bear Family Records printed, under the heading “Discography”, essentially the same kind of sessionography as in the Charly Records booklets (that term did not exist yet).

With the advent of the CD era, such releases called “Complete Recordings” or similarly (boxed sets with sessionography) have become widespread.

Thus, today the natural source for a really serious discography of a famous artist is his chronological sessionography, and not releases such as singles, albums and compilations that reflect only part of the legacy created by him, and in a completely random, chaotic order, reflecting only the current marketing ideas of the labels.

These releases only snatch individual fragments from the artist’s stream of recordings, often in modified form, again to suit the technical limits of vinyl production and the market needs of the time, for example those of the 1960s. Why should we care in 2020 about a female chorus overdub made without the artist’s knowledge or consent, simply because “the public liked it”?

The collectors are looking for the original recordings of the artist, without subsequent overdubs, edits, remixes and other distortions, and we can help them with this.

So my proposal is aimed at improving the quality and competitiveness of the database.

Note 1. But even in September 2012, when I added this box to the database, the database didn’t have such a thing as “box,” so you see here: “Packaging: Other”.

Note 2. But even in May 2020, there are people in the discographic community who ask: “Where do we get information about the sessions?”

Note 3. But even in May 2020, such well-documented releases (very precious for collectors) have no appropriate classification in the database and are still called “Compilations”, in the old way. As if such releases could be equated with the numerous cheap “The Best of” or “Greatest Hits” compilations, thrown together from long-released and re-released albums, which have cluttered the market (and the database). There is no way to distinguish them, other than “Other”. But what does “Other” mean, especially when it is placed at the end of the list of release groups? Evidently, “of no interest”. Such releases deserve a special release group type (such as “Complete”) and should be placed in the artist’s list of release groups immediately after albums, before ordinary compilations.

3 Likes

Sorry, but I strongly disagree. Having an extensive collection of releases by several artists, and having analyzed and compared tracks in a sound editor for many years, I can confidently say that even on the same medium, tracks of the same recording differ from each other (often very significantly). So if we talk about the uniqueness of a track, it is unique not per medium but per release. There are as many different tracks derived from the same recording as there are different releases.

I share some of the same interests as @RocknRollArchivist in this area. In addition to the boxed set sources, there are also numerous discographic resources available in book or website form. The Johnny Cash and Louis Armstrong discographies are two excellent examples of the latter.

My thoughts on structure are a little different, but I think they would require less restructuring of the existing data model. It would involve splitting the existing Recording entity into two, which for the purposes of discussion I’ll call Take and Master (although it would make sense to keep the name Recording for actual implementation).

A Take represents the most granular data we have about a given performance. This is, I think, very close to the OP’s Performance entity. However, it’s only as granular as the data we have available, so for many of the recordings in MB, we would only have a single “take” that may actually be drawn from multiple recording sessions. (For instance, “Darkness on the Edge of Town” was recorded over a long period, and presumably multiple actual takes across multiple sessions, but if we lack data on what was recorded when it would be treated as a single “take” with recording date 1977-1978.) Performer relationships, recording date and location, and associated work(s) are some of the current Recording relationships that would become relationships of a Take.

A Master is essentially the product of one or more Takes prepared for release. Recording relationships like mixed by, mastered by, etc. would become relationships of the Master.

The bulk of the current MB Recordings would likely become one Take and one Master, because we simply don’t have finer grained data.

Overdubbed recordings would be represented as one Master associated to two or more Takes.

An edit, new mix, or (potentially) remastering would all be separate Masters sharing the same Take (or set of Takes). Since performers and dates are associated to the Take, they only need be entered once and can carry over to all Masters.
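
A minimal sketch of this Take/Master split, to show how performers entered once on a Take carry over to every Master built from it. The class and field names follow the working names used in this post; the sample data and the `performers()` method are assumptions for illustration, not a proposed implementation.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Take:
    title: str
    date: str                                   # only as granular as the data allows
    performers: Set[str] = field(default_factory=set)


@dataclass
class Master:
    title: str
    takes: List[Take]
    mixed_by: str = ""

    def performers(self) -> Set[str]:
        """Performers are not stored on the Master; they carry over from its Takes."""
        result: Set[str] = set()
        for take in self.takes:
            result |= take.performers
        return result


# Hypothetical sample data.
basic = Take("Song 1, take 4", "1956", {"Singer", "Drummer"})
chorus = Take("Song 1, chorus overdub", "1959", {"Vocal group"})

album = Master("Song 1", [basic])
edit = Master("Song 1 (single edit)", [basic])                 # same Take, another Master
overdubbed = Master("Song 1 (with chorus)", [basic, chorus])   # one Master, two Takes

print(sorted(overdubbed.performers()))
```

The album version and the single edit share one Take, so their performer credits cannot drift apart, while the overdubbed Master picks up the extra performers from its second Take.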

I’m not sure I see the value to the Performed Title entity. It seems like this would be derived from the relationship between Session and Performance (or Take).

I’m also unsure about the merits of the various Recording/Overdub subtypes. At the session level, at least, these are not always clearly divided. In other cases the finished recording is edited together from multiple complete and partial takes - are those regular recordings or overdubs? Perhaps “overdub” could be an optional boolean property rather than a subtype.

What about a really serious discography tomorrow?

I don’t see appealing to collectors with very exact interests as increasing the ‘competitiveness’ of the database. That’s not to say that I disagree with the logic here, but I believe that the current complexity of MB (at least in appearance) is what decreases its competitiveness.

The web 2.0 world that MB lives in requires it to be appealing for users to not only ingest the content on offer, but also to create it. That said, if your structure doesn’t require me to know what a session or an overdub is to enter data, I wouldn’t see a problem.

3 Likes