PSA: we might have accidentally leaked your email address and birth date :(

I am not familiar with this phrase, but a quick Google search tells me it means something like "Stop acting like an idiot or talking rubbish". I’m sorry if I upset you, but I don’t think I was talking rubbish. I don’t know anything more about the issue myself except for what was written here or on the blog. I just tried to answer your question.

My apologies if I misunderstood the phrase.

3 Likes

Addressing the blog post:

The delay of the notification to everyone until after a fix was in place makes good sense to me.

I’m not seeing anything about a sweep of code to ensure that something similar is not still in place.

This is concerning.

I’d like management to address this issue in depth.

And I apologise for over-reacting to “Obviously”.

2 Likes

Mmmh, ok. I didn’t get this. I meant this more in a sense of “seemingly” or “as it looks like”, as the blog post does not talk explicitly about a time range. I think this is just me not being a native speaker. All good again :smiley:

1 Like

Question: I see people talking in other forum posts about how they can “Download a copy of the MB website to install on their own server”.

Can I get a confirmation that the leak would not be in that code? I assume those people with a clone of this website would not have our email and DoB data?

Thanks.

5 Likes

I will take this a step further, but am admittedly talking about stuff that is way out of my league -

Since the data only appeared when running JSON:
If someone downloaded MB and ran it through JSON, would the data appear from the downloaded version the same as it did here?

I guess the development or downloadable server has an anonymised database of users, like test.musicbrainz.org.

If you go there and log in (with your user account, but the password is just mb) and then go to edit your profile, you will see that your email address is empty, as are your gender, date of birth, bio, and everything else except the user name.

1 Like

My 2 cents: The notification on the forums here and the small link on the frontpage to the blog aren’t enough for notifying users. I bet a lot of people never check the forums and rarely look at the frontpage of the site, let alone the small blog post links on the side. If sending email alerts would be hard to do due to spam, I think a site-wide banner notification about it (like the one you see when you have new edit notes) would be a good way to notify users. It may be possible to tune it as well so only people who might be affected see it.

6 Likes

According to MusicBrainz Database / Download - MusicBrainz
The datafile mbdump-editor.tar.bz2 contains:

This table includes non-personal user data about the people who’ve enacted the edits enumerated in the database above.

I’m not sure if “non-personal” includes our email address. If you want to check it yourself, you could DL and have a look into this file: http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20201125-001801/mbdump-editor.tar.bz2 (file size about 92MB)

1 Like

It does not :slight_smile: This data is only in our own server, we never dump it (we do have a backup, but that’s about it). So, it can only be leaked if we ourselves do something stupid, like here, but not by any other means.

10 Likes

We were first emailed by a user on November 5 informing us that they had changed their email on MB to an address that was not used anywhere else, and it had started receiving spam. So on the same day as receiving that mail, we:

  1. Did an audit of our servers and found no signs of intrusion
  2. Did an audit of our data dumps and found no email addresses in them
  3. Set up a honey-pot email address on a test user to see if it started getting spam

However, we noted that their updated email address was almost identical to the old one, and posited that the previous email had already been leaked by other means and spammers might be using an automated guesser to reach the new address. On November 6 we asked them to try changing the email to something much more random and watching if spam appeared again.

At this point I didn’t start a full audit of the musicbrainz-server codebase, though outside of auditing the data dumps I of course reviewed the obvious places like user profile pages that might somehow contain emails. The reality was that almost every page on the site deals with editor objects in some way, so it was hard to guess where the leak might be without doing a full audit (and the issue indeed turned out to be a place I didn’t expect).

While I regret not pursuing a full audit sooner in hindsight, I was then satisfied with the request to try a more random address first. We also had assumed that if there was a widespread leak, more than one user would’ve contacted us. It later came to my attention (today) that someone had reported something similar back in August. If I’d seen that thread and started investigating it then, the timeline would likely look a lot different. I regret to say that I don’t follow the community forums very frequently, but I’m going to try to do that more often – and if anyone finds a similar issue in the future, emailing us directly in addition to posting on the forums would be helpful just in case.

On November 7 they mailed us back saying they updated the email to a much more random address. Then on November 22 they mailed us again to say that, unfortunately, the new address had also started receiving spam. This was the first email I saw when I woke up and I immediately started a full audit of musicbrainz-server.

Once I found the issue with annotations, it was hotfixed immediately, but I continued reviewing all the places where we send editor data to the templates. This took about 8 hours, but by the end I was highly confident there were no other cases of exposure (and could finally eat something).

Yes, the blog post on November 23 was the first public disclosure of the leak.

There is nothing higher priority for me right now than ensuring this doesn’t happen again. I’m still actively working on reducing the amount of editor data we handle in our templates overall and improving automatic detection of leaks during testing and development. Pull Request #1809 on metabrainz/musicbrainz-server ("Improvements to editor JSON handling" by mwiencek) is a start, but I don’t plan to work on anything else this week and possibly next.
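To illustrate what automatic leak detection during testing could look like, here is a minimal sketch in Python. It is not the code from the pull request or from musicbrainz-server. The key names in `SENSITIVE_KEYS`, the helper names, and the payload shape are all hypothetical and chosen only for illustration. The idea is simply to walk any JSON that is about to be embedded in a page and fail the test if a private field shows up anywhere in it.

```python
import json

# Hypothetical names of fields that must never reach a public page.
SENSITIVE_KEYS = {"email", "birth_date"}


def find_sensitive_paths(value, path="$"):
    """Return the path of every sensitive key found in a decoded JSON value."""
    found = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}"
            if key in SENSITIVE_KEYS:
                found.append(child_path)
            # Keep descending: a leak can hide several levels down.
            found.extend(find_sensitive_paths(child, child_path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            found.extend(find_sensitive_paths(child, f"{path}[{index}]"))
    return found


def assert_no_leak(payload_text):
    """Raise AssertionError if the JSON text contains any sensitive key."""
    leaks = find_sensitive_paths(json.loads(payload_text))
    if leaks:
        raise AssertionError(f"sensitive fields in payload: {leaks}")
```

A check like this only catches fields whose names are on the list, so it would complement, not replace, sending less editor data to the templates in the first place.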

18 Likes

For what it’s worth, I write an annotation on almost every release due to a-tisket and I haven’t had one spam e-mail that I believe was a result of MB. So, hopefully that’ll continue.

1 Like

Quick update from me – we’ve got a request in to a GDPR-versed attorney to advise us on what else we need to or should do. Once we get our questions answered, I’ll take the advice to our board of directors for review and after that we will take action on outstanding issues.

On my radar for outstanding issues:

  1. Do we need to notify any governments about the breach and if so, which one?
  2. Should we email affected users?
  3. Should we make a site-wide banner for a few days to advise users who are not following here or the blog?

However, involving attorneys does slow things down, so it will still take some time to resolve these final questions. I’ll post more information as I receive it.

Thanks!

13 Likes

Note that not all affected accounts have actually been spammed. It seems likely that spammers did not even find most of them. Instead, they most probably just crawled the website at random to gather personal details from the webpages’ source code without understanding how the leak worked.

In order to inform all potentially affected editors, we will be sending special notification mails during this week. If you don’t receive such a mail by next week, then your account is not affected at all. We will update this topic when the notification campaign has ended.

Update: All mails had been sent by December 21, each with the exact count of webpages that exposed the editor’s data. We don’t know if these pages have been accessed and scanned by a spammer though. If you did not receive or lost this mail, just contact us for manual confirmation.

13 Likes

I wanted to give you all an update on where we currently stand with regard to the open questions surrounding this breach.

Legal advice: We have received legal advice on this topic and it has been confirmed that as a California organization we are not required to file a breach notification in the EU, even if EU citizens’ data was leaked. However, California law stipulates that if a data breach affected more than 500 Californians, we are required to file a breach notice with California.

Breach notification: We have no way of knowing if more than 500 Californians were affected, since we do not know where our users live (or have their legal residence). If we assume that ⅓ of our users are in the US, ⅓ in the EU and ⅓ elsewhere, and if 12% of the population of the US lives in California, then possibly more than 500 were affected. This math is iffy at best, but we filed a breach notification with the state of California just in case.
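For anyone who wants to follow the estimate, here is a small Python sketch of the arithmetic. The ⅓ and 12% figures are the assumptions stated above and the 500 threshold is the one from the legal advice. The variable and function names are mine, and the sketch does not use the actual number of affected accounts, which is not given in this post. It only works out how many affected users it would take for the estimate to cross the threshold.

```python
from fractions import Fraction
import math

us_share = Fraction(1, 3)              # assumed share of users in the US
california_share = Fraction(12, 100)   # assumed share of the US population in California
threshold = 500                        # notification threshold for affected Californians

# Estimated share of all users who are Californians: 1/3 * 12% = 4%.
californian_fraction = us_share * california_share


def estimated_californians(affected_users):
    """Estimate how many of the affected users live in California."""
    return affected_users * californian_fraction


# Smallest number of affected users for which the estimate is strictly above 500.
min_affected = math.floor(threshold / californian_fraction) + 1
```

Under these assumptions about 4% of users are Californians, so the estimate goes above 500 once more than 12,500 users are affected.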

Notifying users: After an internal discussion we decided to email each of the possibly affected users and send them the same notification that we’ve posted in this post. It will take up to a week for our script to email all of the users, so stay tuned. We will post another update when this process is done. Since we’ve decided to notify users by email, we will not be putting up a site-wide banner with the notification.

Let us know if you have any more questions!

P.S. This is all the proof we have that we filed a breach notification:

18 Likes

Your lengthy and detailed reply increases my confidence that MB realises the seriousness of the failure to protect identifying user data.

Thank you. It has been easy to give MB a miss while my confidence about that was low.

5 Likes