Implications of AI generated content

Continuing the discussion from The General Chatter/Off Topic Thread:

AI generated content is making great progress lately and has become quite easy to “create”.
E.g. Stable Diffusion for images, ChatGPT for text and now music too.

What will this mean for MusicBrainz? (And other MetaBrainz projects such as BookBrainz for that matter)

@reosarevok and @rob meant to talk about this on IRC last week but there doesn’t seem to be a conclusion yet. (Some other IRC users commented on this subject though)

Actually such entries already exist in the database:

Relatedly such tools are also used for upscaling cover art or voice cloning (keyword: ElevenLabs) and synthesis (e.g. Apple audiobook narrator).

Content focused sites such as Pixiv or Danbooru have already banned AI art or strictly require it to be tagged.

4 Likes

@chaban : My original intent, before getting derailed, was basically to start this discussion. Thank you for following up!

AI generated music is only days, perhaps weeks away, not years. I expect most of it to suck at the start – early clues suggest that lyrics are difficult for AI and I expect we’ll find other shortcomings as well. But, I suspect that electronic dance music without lyrics will be the first “genre” to get acceptable AI generated music. Coming very soon.

Collectively we need to work out how to handle this, because we can clearly see that some problems are headed toward us. If ChatGPT is any indication, if people can enter some text and create music, then a wall of new music is going to be created, including a fair number of charlatans who are going to pass off AI music as their own. And these people will turn up at MusicBrainz and try and index this music as their own creation.

And yes, if they typed in the prompt to create the track, one could argue that they are the artist and the AI is their instrument. That doesn’t feel wrong to me. But at the same time, it doesn’t sit well either.

I’ve been on the sidelines of the music industry long enough to remember old dinosaurs from the industry lament that electronic music producers are not “real musicians”. But, it is clear that these DJs possess quite a lot of skill to produce this music, so who is to say who is “real” and who is not?

We find ourselves at such a crossroads and we could be the “get off my lawn” folks and want to just keep out all AI. That doesn’t seem right, nor does trying to index every MusicGPT result to ever come out of it.

We’ll need to find a nuanced approach to many sticky questions. Let me list off the questions I already see:

  • Are “MusicGPT” users considered artists? Why or why not?
  • If a MGPT user comes to MB and enters their tracks, what should our response be? Clearly some tracks are going to be worthwhile to index because at least some people will love them. But a whole pile of them will be rubbish. But since beauty is in the eye of the beholder, where do we draw the line?
  • Do we have a different response to people who are passing off MGPT output as their own musical creations (e.g. fraud) and people who are direct and honest about it?

What other tricky questions do you see coming? Perhaps a good intermediate goal is to collect questions that we feel we need to answer. I doubt we’ll find many answers immediately, which I think is ok. I would think that the main purpose of this thread is to start being aware of the questions and issues.

We can then see how the world reacts to these issues – court cases are already flying with few clear ideas as to how they might play out. Fortunately the text and image AI fields are going to spawn lawsuits and difficult decisions before music arrives in the same predicament. We may have some time to really wrap our heads around this issue before we’re forced to make policies and enforce them.

Fingers crossed!

9 Likes

We have people with Synthesizers that pretend to be a whole “Orchestra”. Is that “fraud”? Or just being economical with the musical truth?

Could MusicGPT be seen as a type of Instrument? It is a synth on steroids. Electronic Music with a random number generator attached. You listen to the machine algorithm in the same way you listen to the notes on a trumpet.

As a database it would be good to be able to “see” the MusicGPT stuff, if it is declared. But that guy with his synth who calls himself an Orchestra is not telling us he is using a Synth, so he currently hides in plain sight.

6 Likes

To me, AI-generated music is still music, so there’s no question it should be indexed.

The question is whether a human is still behind it or not. AFAIK, today’s AI generators are still human-driven, though it is perfectly possible that AI generators may become AI-driven, to the point where it is hard to credit any human.

To me it leads to 2 questions:

  • is an AI music generator an instrument (“sufficiently” human-driven, to the point where a human can be credited)?
  • is it an artist (no human can be credited)?

Now, if we consider AI as an artist, should we differentiate them? In this case, we just need to add a new artist type, and let editors add a name for this AI (“disco music generator 2.1”, type “AI”), possibly with an extra relationship (“AI driven by” or such).

Of course, humans may lie, crediting a work to themselves when it was AI-generated, but that’s not really our problem (human artists have been credited for works they never made, we have plenty of examples).
And of course, some will use generators for the sole purpose of creating content in order to generate traffic (we already have fake releases/artists).

Legal issues aren’t really our problem either (AI will lead to a LOT of legal issues, we’re just a music database).

9 Likes

It seems to me like one of the core problems will be that of volume. AI-based tools make it easy to produce essentially limitless amounts of content. The guidelines provided in Notability check before adding a release to MusicBrainz are just “the music needs to exist”.

If someone sets up a model that continuously writes MP3 files to a publicly-available server (maybe with GPT-3 generating song and album titles and Stable Diffusion or DALL-E generating album art), is all of that fair game for inclusion in MB if a human (or automated system, pursuant to the bot policy) is willing to add it?

6 Likes

I expect they would build a submission bot to operate at the same time… merge it with the SEO bot. Keep the Wikipedia page up to date at the same time… :robot: Upload to bandcamp\spotify\etc. Another bot farm then plays the tracks to get registered in the charts… Who needs a human in that chain?

Which is part of the question. We may talk of categories, or setting an instrument, but policing it could get out of hand. Many releases added to the database barely tick the most basic of boxes. It will be hard to get these new submissions to tick “instrument: MusicGPT”

Yes. The current definition Artist - MusicBrainz is pretty wide and can be extended to include AI users:

An artist is generally a musician (or musician persona), group of musicians, or other music professional (like a producer or engineer). Occasionally, it can also be a non-musical person (like a photographer, an illustrator, or a poet whose writings are set to music), or even a fictional character. For some other special cases, see special purpose artists.

This definition isn’t exhaustive anyway: any credited individual is eligible as “artist” in MusicBrainz, not only those listed.

Nothing special really. There has always been rubbish, it doesn’t have anything to do with AI.

The use of AI can already be indicated in the same way as for other software, by using a special artist of type “Other”. For example, see Vocaloid artists including the popular Lily - MusicBrainz. However we don’t have clear guidelines about it, it is just de facto common practice. By the way, given the number of Vocaloids, and upcoming AIs, it might be worth having an artist type “Software”?

For mistaken creation credits, including fraud, there is Relationship type / Previous attribution - MusicBrainz.

1 Like

I believe yes. I mean, we credit artists for programming digital instruments, so I don’t believe this should be treated much differently than that (though maybe with a new type of relationship). if we have the situation mentioned above, where an AI fully creates a thing, then they should be credited as the artist

I’ve been holding this quote from the About page pretty close to my heart recently:

As an encyclopedia and as a community, MusicBrainz exists only to collect as much information about music as we can. We do not discriminate or prefer one “type” of music over another, and we try to collect information about as many different types of music as possible. Whether it is published or unpublished, popular or fringe, western or non-western, human or non-human — we want it all in MusicBrainz.

with this in mind, I’m not sure if we should draw a line. just like some people might enjoy speedcore or data sonification today, some people might enjoy AI music tomorrow.

I’ll second what @IvanDobsky said above, and treat them the same way we treat a solo orchestral producer. we’ve already got plenty in the database, like Makkon, RoomVR, and Heartsong, just to name a few


I’ll cap this off with a recent video by Adam Neely where he talks about AI from a musician’s perspective (just the first ⅓ of the video or so). he talks about how this is only the most recent of the “revolutions” in music technology, the other two he brings up being the advent of recording technology and the birth of MIDI. it’s interesting to hear a quote by John Philip Sousa talking about Thomas Edison’s phonograph the same way the media today talks about AI…

1 Like

Great convo!

I came to MusicBrainz because it’s broad. We have a lot of users here who are interested in niche things - even things that other MB users raise their eyebrows at. YouTube videos, gamerips, bootleg compilations, spending days filling out complex classical music relationships or differentiating Pink Floyd recordings… It’s what sets MB apart. Otherwise I’d be on Discogs quite frankly, their popular music coverage is great.

If there’s a flood of completely new/different music I don’t think MB’s question should be “will we allow it?” I think it should be “how do we allow it?”

On that note I think we are very well-equipped to deal with this. I imagine we can have a relationship like ‘prompted…’ for the ‘artist’ that prompts the bot. Do we really care if the bot has thousands (or millions!) of recordings, if someone’s inclined to enter them properly?

Particularly in the early days, being the place that captures this piece of music history, as it happens, would be really cool. Even if it ends up that we’ve nicely catalogued the start of the end of humanity :stuck_out_tongue_winking_eye:

6 Likes

I would expect that the artist would be the prompter and the bot would be the instrument, but other than that, this seems fine to me. New instrument “deep learning bot” or something (can we not call it AI? There’s nothing artificially-intelligent about it at the moment, it’s not like it passes the Turing test), new relationship “prompted” that allows using the instrument, with a credit for the specific bot, and that should be it.
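As a rough illustration of the proposal above (this is not actual MusicBrainz schema; every class and field name here is made up for the example), the prompter-as-artist, bot-as-instrument idea with a “prompted” relationship could be sketched like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artist:
    name: str


@dataclass(frozen=True)
class Instrument:
    name: str


@dataclass(frozen=True)
class PromptedRelationship:
    """Hypothetical "prompted" relationship: an artist used an instrument on a recording."""

    artist: Artist
    instrument: Instrument
    recording: str
    credited_as: str = ""  # name of the specific bot, as credited on the release

    def describe(self) -> str:
        credit = f" (credited as {self.credited_as})" if self.credited_as else ""
        return f"{self.artist.name} prompted {self.instrument.name}{credit} on {self.recording}"


# All names below are placeholders, not real database entries.
rel = PromptedRelationship(
    artist=Artist("Example Prompter"),
    instrument=Instrument("deep learning bot"),
    recording="Example Track",
    credited_as="Example Bot 1.0",
)
print(rel.describe())
```

The human stays the credited artist, while the specific bot only shows up as a credit on the generic instrument, much like a named synth model would.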

7 Likes

This is a very interesting point!

When my friends are posting “AI generated” images they’ve prompted, on social media, I never think of them as the artist. I was extrapolating that to music. The differences in what’s generated/the type of results depend much more on the bot than the prompter. That’s assuming popular music generators will work like the image bots do (just type in a few words, get a track of music).

There’s an interesting example in the MB database where there’s a set of voice generated ‘characters’ that people use, and when they use them they credit them as artists, usually with ‘feat’ (even though the words have been written for them to ‘read’). Can’t find it atm… anyway, not the same, but related.

It will become clearer with time, but fun to theorize.

are you talking like, how siri rap producers use TTS/Text-to-Speech voices? examples 1, 2

if people are actually crediting the voices they’re using, it’s a bit like the Vocaloids mentioned above, lol


to keep a bit more on topic, I don’t yet have an opinion on how to credit these AI/Deep Learning Bots, but I think we should be open to both ways mentioned here, AI as an artist and AI as an instrument, depending on how people actually start crediting these AI. whether they’re credited like the fake orchestras mentioned above (as an instrument), or like Vocaloids and TTS rappers (as an artist)

maybe AI and Vocaloids could be a case for a new artist-instrument relationship, denoting two items as the same?

1 Like

Agree with this 100%. Let us not call this new Algorithm “AI”. Otherwise we need to call a Synthesiser sentient.

A human will initialise an algorithm in the same way someone pokes a Synth to make a new noise. The machine does not create music. It is just fed a seed to initialise a musical output. Don’t be confused by marketing.

2 Likes

Would there be any concern about server resources either with MB or LB if there was a flood of new data from AI’s generating and adding so many recordings? Honest question, I have no idea what if any impact that might have, but that is the only concern I personally have. Wouldn’t want it to impact the performance of the site.

2 Likes

The human typing “face in the style of van Gogh” into an image generator is surely less of an ‘artist’ than a human drawing something using a digital tablet/brush? I think this translates to future music bots as well.

Maybe we could credit all the developers of the bot, or perhaps all the artists in the dataset that was fed to the bot :stuck_out_tongue:

2 Likes

I see nothing against crediting the bot as the artist. Just the human needs some credit for coming up with the seed phrase and decision to start the process. Producer? Arranger? Initial Idea?

For now, in most cases, someone needs to press the initial button. And there should be space to put them somewhere into the chain of credits. They will name their output “Charlie’s Magic Band” or whatever they feel like. And that should be how we credit it. Where we know who is behind it, they need a credit somewhere.

Ditto on the Work side. There is a level of “collaboration” here. The algorithm did the real work, but it worked from an idea. As to the dataset? That would be a “based on” relationship.

And no, the devs of the algorithm do not get credited. We do not credit the person who constructed the Synth for tunes made with their instrument.

It is my understanding with these AI tools that the same prompt will not always produce the same result; potentially not even a very similar one. That to me makes its function not that of an instrument. So I would go so far as to say that the entity providing the prompt deserves very little credit in most cases.

As a separate thought, I do not feel the works used for the AI’s training set should be ignored; any pre-existing work used to generate an output should, to my mind, be credited.

3 Likes

I have never come across anything like that but, IMO, it’s simpler to completely ignore that for the time being.

Some additional thoughts…

Prompting definitely affects the output, but the output of many algorithms (such as Stable Diffusion) is non-deterministic, meaning the same prompt will generate a different output if repeated. Also worth noting, I think, is the fact that the AI is generating the output based on training from millions of existing works; this is synthetically “standing on the shoulders of giants” at scale, and I wonder how different that is from humans creating based on their own influences.
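To make the repeatability point concrete, here is a toy sketch. It is not a real generator and not how any actual model works; `toy_generate` and its parameters are invented for illustration. The assumption is that the variation between runs comes from a random seed that normally changes each time, so the same prompt only repeats its output if the seed is pinned too:

```python
import random

NOTES = ["C", "D", "E", "F", "G", "A", "B"]


def toy_generate(prompt: str, seed: int, length: int = 8) -> list[str]:
    """Toy stand-in for a generator: output depends on the prompt and a seed."""
    # Seeding with a string is supported by random.Random and is reproducible.
    rng = random.Random(f"{prompt}|{seed}")
    return [rng.choice(NOTES) for _ in range(length)]


first = toy_generate("upbeat dance track", seed=1)
again = toy_generate("upbeat dance track", seed=1)
other = toy_generate("upbeat dance track", seed=2)

print(first == again)  # same prompt, same seed: identical output
print(first == other)  # same prompt, different seed: compare for yourself
```

Whether a pinned seed makes the tool feel more like an instrument is a separate question, since the prompter still has little say over what any given seed produces.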

As pointed out in this thread, the number of recordings is effectively unlimited. So, as also pointed out in this thread, it seems that a notability bar should be set, otherwise, I fear the wheat could get lost in the chaff, so to speak.

I think the nuance here, repeating a previous point, is the machine synthesizes a new work based on existing works, in much the same way that humans perform the act of creating work. And many do so differently every time they’re prompted to. This makes the AI-as-instrument argument far harder for me to swallow.

I think this is a very good point. Somehow the level of effort an “artist” exerts (be it prompter or AI) should at least be an input into its noteworthiness or valuation.

Except synths don’t “create” music, and AI arguably does.

4 Likes

And as though on cue, an article reported by Reuters about ChatGPT books generated and sold on Amazon, so also a #bookbrainz concern:

3 Likes