Is there a way to derive a Wikipedia link from a Wikidata link?

musicbrainz
wikipedia
wikidata

#1

Since MusicBrainz has been replacing Wikipedia links with Wikidata links, I'm rather unclear whether I can derive a Wikipedia URL from a Wikidata URL or not. I suppose I would like the English version if it exists.

I'm interested in getting from Wikidata to Wikipedia pages for release groups and artists.


#2

Take a Wikidata (WD) page: the list of Wikipedia (WP) pages in all languages is (unfortunately) at the bottom of it:

It would be nice if MB displayed links to the languages that you have set in your profile, like the ALL LINKS user script tries to do (cf. screenshot).
But MB does already show you the excerpt and link to WP in the current language.


#3

So I can't derive a link from the Wikidata URL; I have to go to the Wikidata page itself and then parse the page to get to the Wikipedia page.

I do vaguely remember this change. Surely it breaks MusicBrainz's semantic abilities, since you cannot easily provide a Wikipedia link from a MusicBrainz URL, and nobody wants to see a Wikidata link!

There are still plenty of Wikipedia entries. Are these being kept, or will they disappear?


#4

Just been reading through https://tickets.metabrainz.org/browse/STYLE-488

It seems the main driver for this was the difficulty of managing Wikipedia pages as they are edited, deleted or moved. Whilst I sympathize with this a bit, you have this kind of issue with URLs to any website, not just Wikipedia, and by making this change you have introduced a problem for consumers of the MusicBrainz data.

If, as it seems, MusicBrainz has a system in place to derive a Wikipedia link from Wikidata, it would seem to make more sense if:

  1. This was used to create missing Wikipedia links from Wikidata and add them to the database
  2. It was used to remove Wikipedia links when they no longer exist

But wikipedia urls should not be removed.


#5

You don’t need to parse the Wikidata page as a whole. Wikidata has an API and regular data dumps that can help you map Wikidata pages (they’re mostly called “Items” in Wikidata speak) to Wikipedia articles (https://www.wikidata.org/wiki/Wikidata:Data_access). For example, to get the sitelinks to the English Wikipedia for items Q17 (Japan) and Q42 (Douglas Adams), you can use https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q42|Q17&sitefilter=enwiki&props=sitelinks. This uses wbgetentities, which can process requests for up to 50 items at once; that might be good enough, depending on your use case.
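
For anyone who wants to script this, here is a rough Python sketch of the same wbgetentities request, split into batches of 50 IDs. The function names are made up for illustration, and `format=json` is an extra parameter that is not in the URL above; it is added on the assumption that you want plain JSON back. No network call is made here, it only builds the URLs and reads an already decoded response.

```python
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"
MAX_IDS_PER_REQUEST = 50  # batch limit mentioned in the post above


def build_request_urls(item_ids, site="enwiki"):
    """Return one wbgetentities URL per batch of up to 50 item IDs."""
    urls = []
    for start in range(0, len(item_ids), MAX_IDS_PER_REQUEST):
        batch = item_ids[start:start + MAX_IDS_PER_REQUEST]
        query = urlencode({
            "action": "wbgetentities",
            "ids": "|".join(batch),
            "sitefilter": site,
            "props": "sitelinks",
            "format": "json",  # assumption: not part of the original URL
        })
        urls.append(f"{API_URL}?{query}")
    return urls


def extract_titles(response, site="enwiki"):
    """Map each item ID in a decoded response to its article title, or None."""
    titles = {}
    for item_id, entity in response.get("entities", {}).items():
        # "or {}" also covers an empty sitelinks value that is not a dict
        sitelinks = entity.get("sitelinks") or {}
        sitelink = sitelinks.get(site)
        titles[item_id] = sitelink["title"] if sitelink else None
    return titles
```

Calling `build_request_urls(["Q42", "Q17"])` gives a single URL equivalent to the one above; each response can then be decoded with `json.loads` and passed to `extract_titles`.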


#6

Okay, thanks, that is better. So the idea is that I parse the JSON and do

https://en.wikipedia.org/wiki/ + replacespacewithunderscore(title)

Is the replacespacewithunderscore step part of some larger character-replacing logic?
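
A minimal sketch of that idea in Python, assuming that replacing spaces with underscores and then percent-encoding the rest is enough to get a working link. This is not a copy of MediaWiki's own encoding rules, and `wikipedia_url` is a hypothetical helper name.

```python
from urllib.parse import quote


def wikipedia_url(title, lang="en"):
    """Build an article URL from a sitelink title (a sketch, not MediaWiki's exact rules)."""
    # Spaces become underscores; remaining unsafe characters are percent-encoded.
    # "/" and ":" are left alone because they appear in some article titles.
    path = quote(title.replace(" ", "_"), safe="/:")
    return f"https://{lang}.wikipedia.org/wiki/{path}"
```

For example, `wikipedia_url("David Bowie")` gives `https://en.wikipedia.org/wiki/David_Bowie`.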


#7

I don’t know if it’s nicely done but it works, see:


#8

In fact, I’d expect most semantic web people would be much more interested in a Wikidata link, since it contains a lot more information that can be used automatically (including all existing Wikipedia links for all languages, but also all sorts of other data) :slight_smile: I’d say it’s definitely an improvement over having only whatever Wikipedia link people thought was useful (so, generally English).


#9

I wrote that before Mineo's link showing a way to get to Wikipedia, but even that is less than ideal: it requires extra parsing, and it is not clear exactly what steps have to be done (i.e. space to _). The point is that you cannot now go directly from MusicBrainz to a Wikipedia page; you can only get directly to Wikidata and then have to do special parsing yourself, which is more cumbersome, and I am not convinced it is always going to work.

Sure, it was English-biased, but in the real world English is actually what most people want. So a better solution would be to keep the English Wikipedia pages, which is what most people require and doesn’t need extra parsing, but also have Wikidata for access to other pages.


#10

At least that is what they made Wikidata for:
unlike Wikipedia pages, its pages will not change address or be deleted.
You can even always fall back to another language.

I want French, like most of my fellows.
I know some other folks who are more than happy to read biographies, directly in Vietnamese.
But anyway, fortunately, we did not limit our WP links to English.
We limited them to English and the artist’s own language.
User language is one step further; thanks to WD, it is possible.
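
As a rough illustration of that fallback idea, here is a small Python sketch that walks a list of preferred languages and returns the first one with a sitelink. It assumes the sitelink keys follow the `<language code>wiki` pattern (as `enwiki` does), which may not hold for every language code, and `pick_sitelink` is a hypothetical name.

```python
def pick_sitelink(sitelinks, preferred_languages):
    """Return (language, sitelink) for the first preferred language that has an article."""
    for lang in preferred_languages:
        # Assumption: the site ID is the language code followed by "wiki"
        sitelink = sitelinks.get(f"{lang}wiki")
        if sitelink is not None:
            return lang, sitelink
    return None  # no article in any of the preferred languages
```

With French first and English as the fallback, you would call `pick_sitelink(sitelinks, ["fr", "en"])` on the `sitelinks` part of a Wikidata item.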


#11

I specifically mean here that the link Mineo gave didn't directly give the Wikipedia URL, just the parts to help you create a URL, so it seems a flaky process.


#12

There’s a direct link to Wikipedia at the bottom of the WP extract, titled “Continue reading at Wikipedia…”.

Language may not be what you want though, as MB has no way to read your thoughts (yet).
But, at least, it tries to take into account your language preferences (the MB user’s one, the browser’s ones, see MBS-8417).

It was recently improved by MBS-8417


#13

Sorry, I meant you cannot go directly from MusicBrainz data (i.e. data provided by the MusicBrainz database or webservice), rather than from the MusicBrainz website.

But since you have a solution in the MusicBrainz website, how is this done?


#14

#15

Yep, sorry, I thought you were talking about the website… lack of sleep, my bad.


#16

It is not really "just the parts to build a URL", nor a flaky process.
If you load Wikidata (like in the example I sent), you get the FULL English URL (if it exists) via wikidata.sitelinks["enwiki"].

But you now already have several examples to study.
And sure, indeed, you have to query Wikidata to get any Wikipedia links.


#17

Thanks, I need to study it more. I was looking at the JSON output from Mineo's example and it did not seem to return a URL.


#18

https://www.wikidata.org/wiki/Special:EntityData/Q5383.json → sitelinks["enwiki"].url → https://en.wikipedia.org/wiki/David_Bowie :slight_smile:
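
Putting that chain into Python, as a sketch only: it assumes the JSON document is shaped as `entities` → item ID → `sitelinks` → site → `url`, which is what the Q5383 example suggests. The helper names and the User-Agent string are made up, and redirected or missing items are not handled beyond returning None.

```python
import json
import re
from urllib.request import Request, urlopen

ENTITY_DATA = "https://www.wikidata.org/wiki/Special:EntityData/{item_id}.json"


def item_id_from_url(wikidata_url):
    """Pull the Q-identifier out of a Wikidata URL such as .../wiki/Q5383."""
    match = re.search(r"/(Q\d+)$", wikidata_url.strip())
    return match.group(1) if match else None


def sitelink_url(entity_data, item_id, site="enwiki"):
    """Read sitelinks[site].url from a decoded Special:EntityData document."""
    entity = entity_data.get("entities", {}).get(item_id, {})
    sitelink = (entity.get("sitelinks") or {}).get(site)
    return sitelink.get("url") if sitelink else None


def fetch_wikipedia_url(wikidata_url, site="enwiki"):
    """Fetch the item over the network and return its Wikipedia URL, or None."""
    item_id = item_id_from_url(wikidata_url)
    if item_id is None:
        return None
    request = Request(
        ENTITY_DATA.format(item_id=item_id),
        headers={"User-Agent": "wikidata-sitelink-example/0.1"},  # placeholder value
    )
    with urlopen(request) as response:
        return sitelink_url(json.load(response), item_id, site)
```

Usage would be `fetch_wikipedia_url("https://www.wikidata.org/wiki/Q5383")`, with the Wikidata URL taken from the MusicBrainz data.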