Echo Nest, Spotify, and the Rosetta API



So, as detailed in the April Fool’s blog post, the Echo Nest is finally being absorbed into Spotify and is therefore deprecating its API while adding a few more features to the Spotify API. However, they aren’t migrating all functionality.

Critically, they are removing support for their Project Rosetta, almost all mention of which has already been removed from their documentation. This API allowed for translation of artist IDs across different services (Spotify -> MusicBrainz, MusicBrainz -> LastFM, etc.). I have been developing an application for a while whose main function is combining a variety of metadata from different services, and without this API I will not be able to fetch any supplemental information from MusicBrainz IDs, which I start with.

Now I understand Spotify as a corporate entity is incentivized to keep as much traffic away from competitors as possible, but this move seems a little greedy, no? Does anyone in the community know of a similar API which includes MusicBrainz IDs and a few other big services? Or potentially know how exactly they generate the Rosetta functionality? (I assume it isn’t by hand). If not, 6 months of work is basically down the drain for me :frowning:


MusicBrainz could itself function as a kind of “Rosetta Stone”, I guess? We have ISRCs, ISNIs, IPIs, ISWCs, bar codes, and label codes directly attached to their relevant entities. In addition to that, we have URL relationships which allow linking to Discogs, AllMusic, Spotify (not sure if it’s the actual Spotify ID in the Spotify URLs though?), VIAF, WorldCat (OCLC IDs), Wikidata, iTunes, and a number of other external services which have the ID in their URL.

Of course, this is so far mostly manual input - though some of it can be semi-automated with e.g. user scripts running in the browser or automated by setting up a bot.

Edit: Also note that we have access to a data dump from ISNI with their ISNI→MBID mappings, which could be fed to a bot and automatically fed to MusicBrainz if someone were to write such a bot; see Notes from #MetaBrainz Meeting on 2016-04-18
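To make the "MusicBrainz as Rosetta Stone" idea a bit more concrete, here is a minimal sketch of how the URL relationships could be read out programmatically. It assumes the MusicBrainz web service's JSON output with `inc=url-rels`, where each relationship carries a `url.resource` field. The helper names (`artist_lookup_url`, `external_links`) and the sample response are made up for illustration, and the sketch only parses an already-fetched response rather than making any network call.

```python
from urllib.parse import urlparse

WS_ROOT = "https://musicbrainz.org/ws/2"


def artist_lookup_url(mbid):
    """Build an artist lookup URL that asks for URL relationships as JSON."""
    return f"{WS_ROOT}/artist/{mbid}?inc=url-rels&fmt=json"


def external_links(artist_json):
    """Group the URLs found in the 'relations' list by hostname."""
    links = {}
    for rel in artist_json.get("relations", []):
        resource = (rel.get("url") or {}).get("resource")
        if not resource:
            continue  # not a URL relationship, or no target given
        host = urlparse(resource).netloc
        links.setdefault(host, []).append(resource)
    return links


# Hypothetical response fragment; the IDs are placeholders, not real entities.
sample = {
    "id": "00000000-0000-0000-0000-000000000000",
    "relations": [
        {"type": "discogs", "url": {"resource": "https://www.discogs.com/artist/0000"}},
        {"type": "wikidata", "url": {"resource": "https://www.wikidata.org/wiki/Q0"}},
    ],
}

print(artist_lookup_url(sample["id"]))
print(external_links(sample))
```

Grouping by hostname is just one possible design; an application that needs the bare service ID would still have to pull it out of each URL's path, and the format of that path differs per service.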


Can you scrape the Echo Nest for that data?
Or is it getting too late for that…


Well, it seems like they have a rough rate limit of about 120 requests per minute. So, if I wrote a script now and ran it 24/7, I could manage to capture every artist currently in the MusicBrainz database. However, a lot of the features are dynamic (like related artists) and that wouldn’t hold up well as new artists are introduced over time. Also, it’s pretty explicitly forbidden in their TOS.
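As a rough back-of-the-envelope check on that rate limit: 120 requests per minute works out to 172,800 requests per day. The sketch below just does that arithmetic, assuming one request per artist. The artist count passed in at the end is a placeholder, not the real size of the MusicBrainz database, and `days_to_crawl` is a name invented for this example.

```python
def days_to_crawl(n_items, per_minute=120):
    """Days needed to make n_items requests at a fixed per-minute rate."""
    minutes = n_items / per_minute
    return minutes / (60 * 24)


# One day's worth of requests at 120 per minute.
print(days_to_crawl(172_800))  # → 1.0

# Placeholder artist count, for illustration only.
hypothetical_artist_count = 1_000_000
print(round(days_to_crawl(hypothetical_artist_count), 1))
```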


Thank you for pointing out the ISNI dump! I’ll poke around at the other services I’m using and see which ones support it!


I think ISNIs are mostly a library thing, so not sure how many other (non-library oriented) services will support them.


You’re right. It is a real shame that this resource is going away. However, it makes sense that Spotify no longer needs the resource, as the Echo Nest now has access to Spotify’s content and doesn’t need to have a mapping between all of the other companies that it works with.
It’s worth remembering that Rosetta Stone was probably always an internal project developed for their own use which they made publicly available. The goal probably wasn’t to provide a free mapping for the good of the community. It’s always worth remembering that commercial companies have an agenda which may not be reflected in what they do in public.
I performed a “backup” of mapping data for the songs in the Million Song Dataset which I will be making public soon; however, that’s a tiny subset of all of the tracks which the Echo Nest knows about.

I know of a few projects which do this mapping, which I’ll publicise once I know I have permission to do so, but otherwise I can only echo @Freso’s suggestion of making sure that we add this data to MusicBrainz.


On a similar note, if we add fields for Google Play Music {Artist,Album,Track}_IDs, there will be a programmatic way to represent music across different services.