How to get Picard to write the original release date / year of a recording?

The reason this idea has been rejected previously is that we’d (much!!) rather have someone do the extra work (and yes, it is admittedly extra work) of actually adding the earliest known release to MusicBrainz. For a lot of vinyl releases, this can be done by importing from Discogs. (It is also possible to create a mostly “empty” release that, e.g., has no medium/tracklist, just to get the date in.) I’m not aware of anything to indicate that this position has changed.

https://medium.com/@BlueTaslem/time-is-hard-for-computers-programmers-14ef2a7ece77
… and so on and so forth.


Yeah, I get the dream. And I spend a lot of time doing that to fill in gaps for artists I am working on.

The trouble is that the amount of research gets extreme. Until the gaps are filled, is it fair to tell people to trust MB for those kinds of lookups? Are people not going to give up due to the bad data?

This is why I was thinking aloud about a stopgap. If one is working via the API, then there is no way to know how trustworthy the results are without cross-checking.

A Release has a “Data Quality” flag. Could there be something similar at Artist level?

I don’t understand how you would do any less research for adding a release that consists only of a title, artist, and release year.

If such a field exists and you set the earliest release date to “2000”, but it is later discovered that there was a release in late 1999, then the 2000 would have been wrong all along. If you add an “empty” 2000 release (and you know there was a release in 2000), then it was never wrong that there was a release in 2000; you now just add the additional information that there was also a 1999 release (which can also be empty). But what if there was actually a self-published release in 1996 before they got signed (and the 1999/2000 release didn’t have any additional mixing of the recordings)? What if you add the 1996 release but the “earliest release date” field is still set to 2000?

The field you’re lobbying for would not solve the issues you claim it would solve and would not lessen the amount of research needed, but it would introduce data duplication that is liable to go out of sync.
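
To make the out-of-sync concern concrete, here is a small sketch. The release titles and years are made up for illustration, and neither name below is an actual MusicBrainz field; it only contrasts a hand-maintained value with one derived from the releases themselves.

```python
# Hypothetical release data; titles and years are made up for illustration.
releases = [{"title": "Debut (label reissue)", "year": 2000}]

# Option 1: a separately stored field, filled in by hand at some point.
stored_earliest_year = 2000


def derived_earliest_year(releases):
    """Option 2: always compute the earliest year from the releases."""
    return min(release["year"] for release in releases)


# Later edits add older releases, even "empty" ones with only a year.
releases.append({"title": "Debut (label release)", "year": 1999})
releases.append({"title": "Debut (self-published)", "year": 1996})

print(stored_earliest_year)             # 2000, stale unless someone edits it too
print(derived_earliest_year(releases))  # 1996
```

The derived value follows every newly added release automatically, while the stored one stays at 2000 until somebody remembers to update it.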

(And again, note that all that is required for a release to be entered is a release title and a release artist. Everything else is optional. If you want to add a release from a year, you can do with just title, artist, and year. You don’t need to research anything beyond that. You don’t need to add any data beyond that. If you know a release came out in a given year, that is enough data to make it viable to add it to MB.)


Sorry, it is a failing of mine that I couldn’t just add such basic details. I need to be sure about the data I add and then I’d need to be complete.

I’m not proposing anything, more a case of responding to the complexity of the question.

I’ll unsubscribe from this as my comments are clearly just confusing the situation.

Please don’t. You seem to be one of the few active members here that has insights and experiences from both sides of the border.
And you also don’t seem afraid to state opinions in an honest (and perhaps sometimes impulsive) manner.
I think that’s very valuable and applaudable.

Back to the original matter:

Since I am no coder (and clearly on the other side of the border), I will allow myself to play dumb here to make my case.
I will also allow myself to completely ignore the behind-the-scenes technicalities about ‘how complicated dates can be’, or the challenges in having people enter dates correctly.

My simple account from an end-user point of view:

I am using MusicBrainz’ Picard tagger.
I am using it to match a release that happens to be a compilation (released in 2005), and it contains the Michael Jackson song ‘Ben’.
That song dates back to 1972. Nobody would disagree about that.

But, Picard will not retrieve 1972 as an ‘original date’.
It will retrieve 2005.

What would be needed for Picard to be able to retrieve and write 1972 in this case?

Is this really such an awkward or difficult request?


Just answering from a purely Picard side of things: There would need to be an effective way to get this data.

In a sense, this data is already available in MusicBrainz, as MusicBrainz knows about the recording and where it has been used. With works, it even knows about the concept of a song. So we could use MusicBrainz to look up the earliest recording linked to the work. In the best case, the recordings are linked to the work with a date set on the “recording of” relationship. For Ben this would even work. But as you can see, of the many recordings linked to the work, only 3 have a date set.

If we don’t get dates from the work-recording relationships, or if we want to be safe, we would then have to query each recording to get a list of releases and their dates.

In the case of Ben, this would add 1 request for the work and 84 requests for the recordings to be sure about the date. With the MusicBrainz API rate limit of roughly one request per second, that’s at least 85 seconds more, and this is just for one track on the compilation.
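
As a rough back-of-the-envelope sketch of that estimate, assuming about one request per second (which is what the 85-second figure implies); the function and parameter names are introduced here purely for illustration:

```python
def extra_lookup_seconds(recordings_on_work, seconds_per_request=1.0):
    """Estimate the extra time needed to check every recording of one work.

    One request fetches the work, then one request per linked recording
    fetches that recording's releases and their dates.
    """
    requests_needed = 1 + recordings_on_work
    return requests_needed * seconds_per_request


# The "Ben" example: 1 work request + 84 recording requests.
print(extra_lookup_seconds(84))  # 85.0
```

The cost applies per track, so a compilation of well-known songs multiplies it accordingly.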

However, having written this down, I realize we could easily provide a plugin which does only the single work request and relies entirely on the dates on the “recording of” relationship. This plugin would probably provide good enough results in many cases, and in cases where it doesn’t, it would not be too difficult for an interested editor to add the data:

  1. make sure there is a work for the recording being looked up, if not create it and link it to the recording
  2. link the recording to the earliest recording findable, make sure to set a date

However, there is of course the distinction between recording date and release date. If this should be purely about release date there is currently no way around looking for other releases containing either the same recording or another recording of the same work.
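
A minimal sketch of the single-request idea, not an actual Picard plugin: it takes the “recording of” relationships of one work and returns the earliest date set on them. The dictionary layout (a `begin` key holding a `YYYY`, `YYYY-MM` or `YYYY-MM-DD` string) is an assumption modelled on what the python-musicbrainzngs library returns for relationships, and both function names are made up for illustration.

```python
def date_key(date):
    """Turn 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into a sortable tuple.

    Missing month/day sort as 0, so a bare year counts as earlier than
    any more precise date within the same year.
    """
    parts = [int(p) for p in date.split("-")]
    return tuple(parts + [0] * (3 - len(parts)))


def earliest_recording_date(relations):
    """Return the earliest 'begin' date among the relationships, or None."""
    dates = [rel["begin"] for rel in relations if rel.get("begin")]
    return min(dates, key=date_key) if dates else None


# Hypothetical relationship data for one work; most links carry no date,
# which matches the situation described above.
relations = [
    {"type": "performance", "recording": {"id": "a"}},
    {"type": "performance", "begin": "1972-07", "recording": {"id": "b"}},
    {"type": "performance", "begin": "2005", "recording": {"id": "c"}},
    {"type": "performance", "begin": "1972-11-05", "recording": {"id": "d"}},
]
print(earliest_recording_date(relations))  # 1972-07
```

With real data, the list would come from a single work lookup that includes recording relationships, for example something like `musicbrainzngs.get_work_by_id(work_id, includes=["recording-rels"])`. The result is a recording date, not a release date, so the distinction made above still applies.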


I still don’t understand how a “first release date” release group field/property would prevent this? You’ll still have to do the same amount of research anyway to fill out this field (just you’d have to throw most of this research out if only adding the date)?


I really like the idea that @Freso suggested in MBS-1424.


Yes, there really needs to be a way to get dates for individual recordings if we want. I don’t know how accurate it is, but iTunes will tag compilation tracks with an original date. Sometimes I’ll rip a compilation with iTunes just so I can copy that info to the original date field before I do the rest of the tagging in Picard

Picard needs to look up the original release date on the recording, not on the album’s release group.

Right now, %originalyear% for a CD compilation of Buddy Holly’s music will return something like 1996, which is when the compilation CD was released. But if you just look at any of the individual recordings…
for example:


Then it’s obviously listed, right there at the top of the list: 1957.

That’s the year value we want to use to tag the recording/track in Picard.

The data is right there!


@foxgrrl Please see my post above for the issues this currently has from the perspective of Picard. It’s totally possible to write a plugin doing this, but depending on how thoroughly you want to check for the earliest date, the impact on performance would range from very heavy to absolutely horrible. But I would love to see a plugin for this being implemented nevertheless.

In order for this to become an option in Picard itself the performance would need to be much better. This will only be possible if the server can provide an efficient way to retrieve this data.


I requested something similar here:

Prefer lowest year option - MusicBrainz Picard - MetaBrainz Community Discourse

It’s not a problem if it is slow; I need to do it just once for a few hundred files, so even hours would be OK.

BTW, from the other topic: if I search for “Artist - Title” on Google, it will (almost) always find the video and the release year, and it is the year of the first release, which is correct. Which database is Google using?

Try it with any song. The only problem is when the title has the same name as the album; in this case the album is found, but the year is correct here too.

Ok - here are a few more gotchas on what constitutes a recording’s first release in MusicBrainz.

  1. Many almost-identical releases within a release group have different recordings in MusicBrainz despite the releases themselves using the exact same recordings. (I do not have any analysis of how common this is, but my personal anecdotal experiences suggest that it is widespread.) Equally some releases may be pointing to the same recordings, but are actually using re-recordings (which definitely should be different recordings in MB), or could be using re-masterings / remixes / simple edits of the original recording (should these be different recordings?).

So you cannot be sure that the date you get is actually the original release date for that artist / track.

  2. Best of Albums may use a new MB recording object or may refer to (one of) the recording(s) of the original release.

  3. You may get a Best of Album which uses a recording attributed to a NAT (single) release - and so the recording first release date on the Best of Album may actually pre-date the recording first release date on the original album.

  4. At the moment, if you are using %originaldate% or %originalyear% (based on Release Group first release date) as part of the file naming path, all the tracks from the album are put in the same directory because the date is the same for all tracks. If you start to use the recording first release date as original date and as part of the path, then different tracks can have different original dates, and you will need to explicitly change your file naming script to use a different variable containing the Release Group earliest release date; otherwise you will get a different directory for each different original date on the album.

  5. It may or may not be possible to get a more definitive first release date by looking at all the recordings attributed to a work. But this brings its own gotchas - what about cover versions by different artists (e.g. Johnny Cash - Ring of Fire), studio vs. live versions, recordings of the same song by e.g. Toyah (the band) and Toyah Willcox (the person) etc.

The bottom line is that there are a wide variety of different scenarios in the MB database, and this is more complicated than it may seem at first sight.
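
The file-naming gotcha above can be illustrated with a small sketch. The track data, the field names `recording_first` and `group_first`, and the `<year> - <album>` path pattern are all made up for this example; it only shows how a per-track date in the path splits one album across several directories.

```python
from collections import defaultdict

# Hypothetical tracks of one 2005 compilation. 'recording_first' stands in
# for a per-recording first release year, 'group_first' for the release
# group's first release year.
tracks = [
    {"title": "Ben", "recording_first": "1972", "group_first": "2005"},
    {"title": "Track B", "recording_first": "1979", "group_first": "2005"},
    {"title": "Track C", "recording_first": "1983", "group_first": "2005"},
]


def directories(tracks, year_field, album="Compilation"):
    """Group track titles by the directory a '<year> - <album>' path yields."""
    result = defaultdict(list)
    for track in tracks:
        result["{} - {}".format(track[year_field], album)].append(track["title"])
    return dict(result)


# Release group date in the path: one directory for the whole album.
print(directories(tracks, "group_first"))
# Recording date in the path: the album is split across three directories.
print(directories(tracks, "recording_first"))
```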


So any progress on this? I really only want artist-title and original recording date for that artist for thousands of songs. I’d want to sort my music by date - e.g. 80’s music - not 80’s music remastered in 2016. I don’t really want Van Halen mixed in with Ray Blk or Rag’n’Bone Man.


Hi, I have/had the same issue and found nothing close to what I was looking for. I understand and respect the reasoning. However, for me it was more important to have my collection tagged with the oldest date for a recording, understanding that some will be incorrect and at some point I may have to correct manually. I can live with a small margin of error, so without further ado, here is a solution that worked for me: a buggy Python script that I adapted from another example on the web.

You can run it per file (script_name filename) or over a whole directory tree:
find ./ -type f -name '*.mp3' -exec script_name '{}' \;

#!/usr/bin/env python3

# Copyright (c) 2018 Kristofer Berggren
# All rights reserved.
#
# idntag is distributed under the MIT license, see LICENSE for details.

import acoustid
import argparse
import base64
import glob
import os
import re
import taglib
import time
import musicbrainzngs
from datetime import datetime

API_KEY = '1vOwZtEn'
SCORE_THRESH = 0.5

_matches = {}

def acoustid_matches(path):
    """Gets metadata for a file from Acoustid and populates the _matches.
    """
    print('\npath ', path)
    try:
        duration, fp = acoustid.fingerprint_file(path)
    except acoustid.FingerprintGenerationError as exc:
        print('fingerprinting of {0} failed: {1}'.format(path, exc))
        return None
    try:
        res = acoustid.lookup(API_KEY, fp, duration,
                              meta='recordings releases')
    except acoustid.AcoustidError as exc:
        print('fingerprint matching {0} failed: {1}'.format(path, exc))
        return None

    # Ensure the response is usable and parse it.
    if res['status'] != 'ok' or not res.get('results'):
        print(u'no match found')
        return None
    result = res['results'][0]  # Best match.
    if result['score'] < SCORE_THRESH:
        print(u'no results above threshold')
        return None

    # Get recording and releases from the result.
    if not result.get('recordings'):
        print(u'no recordings found')
        return None
    recording_ids = []
    release_ids = []
    for recording in result['recordings']:
        recording_ids.append(recording['id'])
        if 'releases' in recording:
            release_ids += [rel['id'] for rel in recording['releases']]

    _matches[path] = recording_ids, release_ids


def rate_limit(min_interval):
    try:
        sleep_duration = min_interval - (time.time() - rate_limit.last_timestamp)
    except AttributeError:
        sleep_duration = 0

    if sleep_duration > 0:
        time.sleep(sleep_duration)

    rate_limit.last_timestamp = time.time()

def calc_date(release_date, release_year):
    """Return the smaller of release_year and the year in release_date.

    release_date is a MusicBrainz date string: YYYY, YYYY-MM or YYYY-MM-DD.
    """
    formats = {4: '%Y', 7: '%Y-%m', 10: '%Y-%m-%d'}
    date_format = formats.get(len(release_date), '%Y-%m-%d')
    try:
        dt = datetime.strptime(release_date, date_format)
    except ValueError:
        # Unparseable date: keep the year found so far.
        return release_year
    if dt.year < release_year:
        release_year = dt.year
        print("year: ", release_year)
    return release_year

def calc_older_date_from_acoustid(id, release_year):
    """Return the earliest release year found for recording `id`.

    release_year is the earliest year found so far and is returned
    unchanged if no older release is found.
    """
    releases = []
    try:
        result = musicbrainzngs.get_recording_by_id(id, includes=["releases"])
        if result:
            recording = result.get('recording')
            if recording:
                releases = recording.get('release-list', [])

    except musicbrainzngs.ResponseError as err:
        print("err ", err)
        if err.cause.code == 404:
            print("recording not found")
        else:
            print("received bad response from the MB server")

    # Check every release returned for the recording, not only the first one.
    for release in releases:
        if 'date' in release:
            release_year = calc_date(release['date'], release_year)
    return release_year

def identify_and_update(path):
    
    release_date = None
    dt = datetime.strptime('2200', '%Y')
    release_year = dt.year

    acoustid_matches(path)
    
    musicbrainzngs.set_useragent(
    "python-musicbrainz-ngs-example",
    "0.1",
    "https://github.com/alastair/python-musicbrainz-ngs/",)
    
    acoustIDs = None
    if path in _matches and len(_matches[path]) > 0 and len(_matches[path][0]) > 0:
        try:
            acoustIDs = _matches[path][0]
        except:
            pass
            
        if len(acoustIDs) == 0:
            print('acoustIDs NOT FOUND!!!')
            return False
            
        for id in acoustIDs:
            release_year = calc_older_date_from_acoustid(id, release_year)
    
        release_date = str(release_year)
    
    rate_limit(1.0/3.0)
    
    try:
        results = acoustid.match(base64.b64decode(b'Ym5jenB4cmtoOA=='), path)
    except acoustid.NoBackendError:
        print("FAIL : backend library not found")
        return False
    except acoustid.FingerprintGenerationError:
        print("FAIL : fingerprint generation error")
        return False
    except acoustid.WebServiceError as exc:
        print("FAIL : web service error (" + exc.message + ")")
        return False

    for score, rid, title, artist in results:
        song = taglib.File(path)
        #song.tags["ARTIST"] = [artist]
        #song.tags["TITLE"] = [title]
        #print("song.tags ", [song.tags])
        if release_date and release_date != '2200':
            # pytaglib expects each tag value to be a list of strings.
            if "DATE" in song.tags:
                if song.tags["DATE"][0] != release_date:
                    print("current YEAR ", song.tags["DATE"])
                    song.tags["DATE"] = [release_date]

            if "ORIGINALYEAR" in song.tags:
                if song.tags["ORIGINALYEAR"][0] != release_date:
                    print("current ORIGINALYEAR ", song.tags["ORIGINALYEAR"])
                    song.tags["ORIGINALYEAR"] = [release_date]
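        try:
            song.save()
        except Exception as exc:
            # Report the failure instead of silently ignoring it.
            print("FAIL : could not save tags (" + str(exc) + ")")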

    if release_date:
        print("OK release_year: ", release_date)
    else:
        print("FAIL : no matches found")


def main():
    parser = argparse.ArgumentParser(prog="idntag", description=
                                     "Find oldest release year and update track. "
                                     "This is so we can make play lists such as: "
                                     "60s, 70s, 80s, etc... ")
    parser.add_argument("-v", "--version", action='version', version='%(prog)s v1.03')
    parser.add_argument('path', nargs='+', help='path of a file or directory')
    args = parser.parse_args()

    abs_paths = [os.path.join(os.getcwd(), path) for path in args.path]
    paths = set()
    for path in abs_paths:
        if os.path.isfile(path):
            paths.add(path)
        elif os.path.isdir(path):
            abs_paths += glob.glob(path + '/*')

    for path in paths:
        identify_and_update(path)


if __name__ == "__main__":
    main()

Hi, thank you for sharing your code with us.
Can you please format the code by putting ```python before the first line and ``` after the last line? Right now it is nearly impossible to read or copy your code, as the forum software interprets hash signs and indentation and replaces ASCII quotes with smart quotes.

Edit: Thanks to @outsidecontext who has already fixed the formatting.


I think this thread deserves an epilogue.

WOW! So sometimes you can get what you want…

Thanks you-know-who-you-are and everybody else involved!
Sometimes a plan comes together :wink: ((c) The A-Team)

edit:
Even if it takes three years, damn, time flies.


Hi hiccup, thank you very much for your epilogue, it sounds great to me.
Just bumped into this problem and saw that it was picked up by MB, but when I run Picard (2.6.2), I don’t see this field, nor is there a system variable %firstrelyear% in the scripting environment.
How did you pull this off?

For this you can use the new hidden tags:

_recording_firstreleasedate

and

_releasegroup_firstreleasedate

https://picard-docs.musicbrainz.org/en/variables/variables_basic.html

In this thread you can find an example of the script that I am using:

Original release date: Community opinion on how to handle the originaldate tag, first release date of release group and / or recording - #62 by hiccup
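
For readers who prefer Python over tagger scripts, the same idea can be sketched as a plain function: copy the recording's first release date into `originaldate` / `originalyear`, falling back to the release group's date. This is a hedged illustration, not the script from the linked thread. It assumes a dict-like metadata object in which the hidden tags appear with a `~` prefix in place of the leading underscore; the function name and the sample data are made up.

```python
def apply_first_release_date(metadata):
    """Set originaldate/originalyear from the first-release hidden tags.

    Prefers the recording's first release date and falls back to the
    release group's. Leaves the metadata untouched if neither is present.
    """
    first = (metadata.get("~recording_firstreleasedate")
             or metadata.get("~releasegroup_firstreleasedate"))
    if first:
        metadata["originaldate"] = first
        metadata["originalyear"] = first[:4]
    return metadata


# Hypothetical metadata for "Ben" on a 2005 compilation.
track = {
    "title": "Ben",
    "~recording_firstreleasedate": "1972",
    "~releasegroup_firstreleasedate": "2005",
}
apply_first_release_date(track)
print(track["originaldate"], track["originalyear"])  # 1972 1972
```

Inside an actual Picard plugin, a function like this would be called from a track metadata processor; the registration boilerplate is omitted here.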
