Userscript to parse copyright notices

Discussion and support thread for my new userscript:

MusicBrainz: Parse Copyright Notice

Install Source

  • Extracts all copyright and legal information from the given text.
  • Automates the process of creating release-label relationships for these.
  • Also creates phonographic copyright relationships for all selected recordings (userscript only).
  • Detects artists who own the copyright of their own release and adds artist relationships for these (userscript only).
  • Caches name to MBID mappings, so you don’t have to select the same label or artist twice.
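
To illustrate the caching idea from the last bullet, here is a minimal sketch of a name-to-MBID cache. It is not the userscript's actual code: the class `NameToMbidCache`, its methods and the key normalization are hypothetical names and choices made for this example, and the MBID is a placeholder.

```typescript
// Hypothetical sketch of a name → MBID cache; all names here are illustrative.
type EntityType = 'artist' | 'label';

class NameToMbidCache {
  private readonly entries = new Map<string, string>();

  // Assumption: names are matched case-insensitively and ignoring outer whitespace.
  private static key(type: EntityType, name: string): string {
    return `${type}:${name.trim().toLowerCase()}`;
  }

  set(type: EntityType, name: string, mbid: string): void {
    this.entries.set(NameToMbidCache.key(type, name), mbid);
  }

  get(type: EntityType, name: string): string | undefined {
    return this.entries.get(NameToMbidCache.key(type, name));
  }

  // Turn the cache into a string so it could be kept in some persistent storage.
  serialize(): string {
    return JSON.stringify(Array.from(this.entries.entries()));
  }

  static deserialize(json: string): NameToMbidCache {
    const cache = new NameToMbidCache();
    const pairs = JSON.parse(json) as Array<[string, string]>;
    for (const [key, mbid] of pairs) {
      cache.entries.set(key, mbid);
    }
    return cache;
  }
}
```

Keying by entity type keeps a label and an artist with the same name apart, so a cached label never gets suggested for an artist relationship.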

I have also created a GitHub wiki page for this userscript to collect different copyright notice formats. @tigerman325 has been a great help with testing and finding these so far. You can also help by testing the userscript and discussing features on GitHub and in this topic.

Do I assume this means I need to type in that block of text in full? I don’t trust myself to avoid typos.

Now merge it with an OCR tool and it could get interesting…

You can do that. Just typing (C) 2021 Labelname should already save you a bit of time: the add relationship dialog opens automatically, selects the correct relationship type, fills in the date and triggers the auto-complete for the label.
But the primary use case is to paste copyright notices which you can find on other websites and which a-tisket lazily adds to the release annotation.
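
As a rough illustration of what happens with a short notice like (C) 2021 Labelname, the sketch below splits it into relationship type, year and name. The pattern and the names `ParsedNotice` and `parseSimpleNotice` are assumptions made for this example; the real parser supports many more formats.

```typescript
// Illustrative only: handles a single symbol, an optional 4-digit year and a name.
type RelType = 'copyright' | 'phonographic copyright';

interface ParsedNotice {
  type: RelType;
  year: number | undefined;
  name: string;
}

const simpleNotice = /^\s*(©|\(C\)|℗|\(P\))\s*(?:(\d{4})\s+)?(.+?)\s*$/i;

function parseSimpleNotice(text: string): ParsedNotice | undefined {
  const match = simpleNotice.exec(text);
  if (!match) return undefined;
  const symbol = match[1] ?? '';
  const year = match[2];
  const name = match[3] ?? '';
  const isPhonographic = symbol === '℗' || symbol.toUpperCase() === '(P)';
  return {
    type: isPhonographic ? 'phonographic copyright' : 'copyright',
    year: year ? Number(year) : undefined,
    name,
  };
}
```

For example, `parseSimpleNotice('(C) 2021 Labelname')` yields a copyright item for 2021 with the name Labelname, which is the information the dialog needs to pre-fill.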

That’s actually one of my dream goals: to combine the credit parser and importer scripts with OCR to get the data out of the Cover Art Archive (CAA).

Will (P)(C)2021 labelname be clever enough to add both in one go? That would have me jumping in and using this on CDs. I don’t use a-tisket as I generally avoid digital media outside of Bandcamp.

But if you are doing something to get rid of those annoying blocks of text - well done!!

Almost. Currently the parser requires either a + or an & between (P) and (C), but I should probably make that optional.

Edit: Added support for the variant without separator (and a few other edge cases) in version 2022.1.6.
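
For readers curious what an optional separator looks like in a regular expression, here is a small sketch. It is not the pattern used by the userscript; `symbolPrefix` and `detectTypes` are hypothetical names for this example.

```typescript
// Illustrative pattern: one symbol, optionally followed by a second one,
// with "+", "&" or nothing at all in between.
const symbolPrefix = /^\s*(©|\(C\)|℗|\(P\))(?:\s*[+&]?\s*(©|\(C\)|℗|\(P\)))?/i;

function detectTypes(text: string): string[] {
  const match = symbolPrefix.exec(text);
  if (!match) return [];
  const types = [match[1], match[2]]
    .filter((symbol): symbol is string => symbol !== undefined)
    .map((symbol) => (/p|℗/i.test(symbol) ? 'phonographic copyright' : 'copyright'));
  // Drop duplicates such as "(C) & (C)" while keeping the original order.
  return Array.from(new Set(types));
}
```

The `[+&]?` part is what makes the separator optional, so (P)(C)2021 labelname, (P) & (C) 2021 labelname and (P) + (C) 2021 labelname all yield both relationship types.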

I can remember to use an &, but I would think it not too hard to parse a double bracket set like that?

I have recently started to write a bit of documentation for this userscript as the code quickly became more complicated than I initially expected it to, thanks to the reports and suggestions of @tigerman325 and @vzell.
You can find it on a GitHub wiki page – feedback, corrections and improvements welcome (you should be able to directly edit it with a GitHub account):

I also wanted to take the opportunity to share with you this beautiful railroad diagram of the underlying regular expressions used to parse the copyright notices:

Copyright Parser RegEx Railroad Diagram

I would be very interested to know if this is also helpful for people without a coding background to understand what kind of text the copyright parser supports.

I’ve found an issue with the script.
If I swap the entity type manually while parsing a copyright notice using the script, it will give me results but I won’t be able to submit them. Clicking the “enter edit” button after this leads nowhere (it will act as if it’s submitting the edits and then stop).

Sorry, but I can’t reproduce this. I did a quick test with “(P) Test Artist”, clicked “Parse copyright notice”, changed the entity type from Label to Artist, selected the correct rel type again, searched and selected the MB test artist, confirmed the dialog with “Done” and hit “Enter edit”.
Although this is not the intended way to use the parser, it works for me (on the test server).
If you want to force an artist search instead of a label search, you can hold SHIFT while clicking “Parse copyright notice”.

If you did something else which did not work, I need more details (description or screenshot of what you did). Is there an error message on the page or in the browser console when you click “Enter edit”?

Thanks! That worked :sweat_smile:
I think this should be added to the documentation, if it isn’t there already.

Apparently the userscript works reliably now, as I haven’t received any improvement suggestions in recent months :innocent:

Although there aren’t any new features in the classical sense, I’ve finally released a new version today which also supports the new React relationship editor (currently in beta).
So if there are any MBS beta testers among you who had to avoid the beta relationship editor because of all the broken userscripts, here’s at least one userscript that’s working again. (Another one is my Voice Actor Credits script, which uses the same codebase.)

The latest release of this userscript (2023.4.16) doesn’t “Autofocus the parser on page load” even if that checkbox is ticked.

Indeed, I can reproduce this bug which affects both the copyright and the voice actor credit parser userscript. Thank you for reporting!

This used to work in the past, but now there seems to be a timing issue: the previous value of the checkbox has not yet been restored by the time its state is checked.
I did not have the autofocus checkbox enabled recently, so I don’t know if this has been broken since the last userscript release or whether there was a recent MBS change which slightly changed the timing.
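
The kind of timing issue described here can be sketched as follows. This is a simplified model, not the userscript's code and not necessarily how the fix works: `restoreCheckbox`, `shouldAutofocusRacy` and `shouldAutofocus` are hypothetical names, and a plain object stands in for the real checkbox element.

```typescript
// Simplified stand-in for the checkbox element.
interface Checkbox {
  checked: boolean;
}

// Assumption: the stored value is restored asynchronously.
function restoreCheckbox(box: Checkbox, stored: boolean): Promise<void> {
  return Promise.resolve().then(() => {
    box.checked = stored;
  });
}

// Racy variant: reads the state before the restore has finished.
function shouldAutofocusRacy(box: Checkbox, stored: boolean): boolean {
  void restoreCheckbox(box, stored);
  return box.checked; // still the default value at this point
}

// One possible remedy: wait for the restore before reading the state.
async function shouldAutofocus(box: Checkbox, stored: boolean): Promise<boolean> {
  await restoreCheckbox(box, stored);
  return box.checked;
}
```

In the racy variant the check runs while the checkbox still holds its default value, so autofocus is skipped even though the stored setting is enabled.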

The issue should be fixed in version 2023.10.27.

I can confirm it’s fixed for me … thx a lot