Because I lack programming and database management skills, I am running into problems extracting data from MusicBrainz for my academic research. I’m wondering if someone here could help me out, either by extracting the data I need or by guiding me through the process (description of the problem below).
Based on the database schema and the information on the main website, I think it would be easiest to extract three separate sets that I can match myself and combine with the data I have already acquired from other sources. Ideally the sets would look something like this, but I can adapt and limit them based on what is easy to extract: artist-level data (id, name, type, gender, begin/end date(s), nationality/area), album-level data (year, label, artist (id), rating), and song-level data (ISRC, year, label, artist (id), album/single/first release).
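To make the matching step concrete, here is a minimal sketch of what I have in mind once the three sets exist. The rows are made up, and the field names (`artist_id`, `isrc`, and so on) and the helper `attach_artist` are my own placeholders, not the actual MusicBrainz column names.

```python
# Made-up example rows; field names are placeholders, not the real MusicBrainz schema.
artists = [
    {"artist_id": 1, "name": "Artist A", "type": "Group", "area": "Belgium"},
    {"artist_id": 2, "name": "Artist B", "type": "Person", "area": "France"},
]
albums = [
    {"album": "Album X", "year": 2009, "label": "Label 1", "artist_id": 1},
    {"album": "Album Y", "year": 2012, "label": "Label 2", "artist_id": 2},
]
songs = [
    {"isrc": "XX0000000001", "year": 2009, "artist_id": 1},
    {"isrc": "XX0000000002", "year": 2012, "artist_id": 2},
    {"isrc": "XX0000000003", "year": 2013, "artist_id": 3},  # no matching artist
]


def attach_artist(rows, artist_rows):
    """Left-join rows to artists on artist_id; unmatched rows get None fields."""
    by_id = {a["artist_id"]: a for a in artist_rows}
    merged = []
    for row in rows:
        artist = by_id.get(row["artist_id"], {})
        merged.append({
            **row,
            "artist_name": artist.get("name"),
            "artist_type": artist.get("type"),
            "artist_area": artist.get("area"),
        })
    return merged


songs_with_artist = attach_artist(songs, artists)
albums_with_artist = attach_artist(albums, artists)
```

In other words, as long as each album and song row carries the artist id, I can do the combining on my side; rows without a matching artist simply keep empty artist fields.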
I am a PhD researcher at the economics department of KU Leuven in Belgium. I’m conducting an econometric analysis of the impact of music piracy on music sales and live performances. I have collected datasets of digital song sales and concerts in European countries between 2008 and 2015. I am looking to complement these data with characteristics of the artists and their releases, to help me determine to what extent illegal downloading affects different segments of the music industry, which will be the main contribution of my work.
After a lot of trial and error I can now (seemingly) access the database through the pre-configured server in VirtualBox, but I can’t figure out how to proceed from there. Most forum posts and the GitHub entries about the database server assume programming or database management knowledge that I don’t have. I got to this point thanks to this forum entry: Trying to access MB database through pgAdmin and VirtualBox. If someone could explain in simple terms how to go from there to accomplish my goals, that would be extremely helpful as well.
My research is purely academic and non-commercial. I aim to publish all my analyses. I would be immensely thankful if someone could help me out, as I’ve been struggling with this for a while already.