Setting up a database image: where to start

I’m thinking about loading a MusicBrainz database image onto my (Linux) computer to play around with writing some SQL queries. I’m not interested in installing the MusicBrainz web server setup.

What would be the best place to start for this?
I see some potentially useful instructions at https://github.com/metabrainz/musicbrainz-server/blob/master/INSTALL.md
and some potentially useful instructions at https://bitbucket.org/lalinsky/mbdata/

For someone like me who doesn’t want to install any more stuff than necessary – would one of those two places be the best place to start, or is there somewhere else better?
It seems like some of the instructions I’ve seen may only work if I install everything.

MBData used to be the way to go for this, but I’m not sure if it still is. I know we run the MB database in separate Docker containers nowadays, so maybe there’s a standalone Docker setup for it, but I have no idea; @yvanzo or @bitmap might know.

The options you have are:

  • A VM running docker and the full server
  • Docker images running the full server
  • The server code downloaded from github and configured yourself
  • MBSlave - a tool for replicating just the data

Each of these has advantages and disadvantages; the choice depends on which database you want to use and whether that database runs on a real OS or inside a container.
The easiest way to set up a mirror would be to use docker-compose to build a set of Docker containers; once you have this running, you can stop the containers you don’t need.

Just confirming both answers above with a bit more detail:

The mbslave script now bundled with mbdata is explicitly made “for managing a replica of the MusicBrainz database”, but it also mentions that “if you don’t need [database] customizations, it might be easier to use the replication tools provided by MusicBrainz itself”, which I don’t know of (never used mbslave so far). With this option, the PostgreSQL server has to be provided by, and will run on, your system.

Alternatively, musicbrainz-docker comes with basic instructions (just see the edit below and ignore “Build search indexes”), but any customization requires editing the docker-compose configuration. For example, you can slim it down by removing the “search” and “indexer” sections (and anything that depends on them) from docker-compose.yml even before the initial “sudo docker-compose up -d”. With this option, the PostgreSQL server is provided by, and runs in, a Docker container.
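
To illustrate the pruning idea, here is a minimal Python sketch. It assumes the contents of docker-compose.yml have already been loaded into a plain dict (for example with a YAML library), and the helper name `drop_services` and the simplified example mapping are hypothetical, not the real file. It only handles `depends_on`; the real file may reference these services elsewhere (volumes, environment variables), which would need checking by hand.

```python
def drop_services(compose, unwanted):
    """Return a copy of a compose mapping without the unwanted services.

    Also removes the unwanted names from every remaining service's
    depends_on entry, which may be a list or a mapping.
    """
    unwanted = set(unwanted)
    services = {}
    for name, spec in compose.get("services", {}).items():
        if name in unwanted:
            continue
        spec = dict(spec)  # shallow copy so the input is left untouched
        deps = spec.get("depends_on")
        if isinstance(deps, list):
            spec["depends_on"] = [d for d in deps if d not in unwanted]
        elif isinstance(deps, dict):
            spec["depends_on"] = {
                k: v for k, v in deps.items() if k not in unwanted
            }
        # Drop an empty depends_on entirely.
        if "depends_on" in spec and not spec["depends_on"]:
            del spec["depends_on"]
        services[name] = spec
    result = dict(compose)
    result["services"] = services
    return result


# Hypothetical, heavily simplified stand-in for the real compose file.
example = {
    "services": {
        "db": {"image": "example/db"},
        "musicbrainz": {"depends_on": ["db", "search"]},
        "search": {"image": "example/search"},
        "indexer": {"depends_on": ["search"]},
    }
}

pruned = drop_services(example, ["search", "indexer"])
print(sorted(pruned["services"]))
```

Editing the file by hand works just as well; the sketch mainly shows that both the service sections and any dependencies on them have to go.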

Edit: Querying PostgreSQL in the db container from your system requires enabling docker-compose.public.yml by setting the COMPOSE_FILE variable to “docker-compose.yml:docker-compose.public.yml”, either from your shell or in the .env file, so it is not really straightforward.
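
For the .env route, a small Python sketch of adding or updating the COMPOSE_FILE line might look like this. The helper name `set_env_var` is hypothetical, and the sketch works on the file's text rather than touching any file, so reading and writing .env is left to the caller.

```python
def set_env_var(env_text, key, value):
    """Return env_text with KEY=value set, replacing an existing KEY line."""
    lines = env_text.splitlines()
    new_line = f"{key}={value}"
    for i, line in enumerate(lines):
        # Only match real assignments, not commented-out ones.
        if line.strip().startswith(f"{key}="):
            lines[i] = new_line
            break
    else:
        # No existing assignment found: append a new one.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Value taken from the post above.
COMPOSE_FILES = "docker-compose.yml:docker-compose.public.yml"

print(set_env_var("", "COMPOSE_FILE", COMPOSE_FILES), end="")
```

Setting the variable from the shell instead has the same effect for that session only.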

Just a suggestion: try doing the full install, get it running the way you want it, and then uninstall or disable the applications you don’t want one at a time. If that breaks anything, you’ll know you need it and have to re-enable or reinstall it.

Just a suggestion.

Okay, so the gist I’m getting is that what I want to do is not especially easy, and there’s no well-tested process for doing it. On the other hand, nobody has responded that it’s a terrible idea, so there’s that…

So, a more specific question – my Linux distro comes with PostgreSQL 11.4 as the standard version of PostgreSQL. Has anyone tested MusicBrainz stuff on 11.4? Am I just making things harder for myself by using such a recent version of PostgreSQL? I have no problem switching to an older version if there’s reason to think it will make the process easier/smoother.

The minimum supported version is Postgres 9.5, as the server uses some features introduced in that version.
There should not be any problems running on a more recent version.
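
As a quick illustration of that rule, here is a Python sketch that compares a version string against the 9.5 minimum. The function names are hypothetical, and the parser is deliberately simple: it only reads the leading dotted numbers of a string such as “11.4”.

```python
MIN_VERSION = (9, 5)  # minimum Postgres version stated above


def parse_version(text):
    """Turn '11.4' or '9.4.26' into a tuple of ints, e.g. (11, 4)."""
    parts = []
    for piece in text.strip().split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break  # stop at suffixes such as 'beta1'
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(text, minimum=MIN_VERSION):
    """True if the version string is at least the minimum version."""
    return parse_version(text) >= minimum


print(meets_minimum("11.4"))
```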

If you are using your operating system’s Postgres, there are two database extensions that the server uses: postgresql-musicbrainz-unaccent and postgresql-musicbrainz-collate.
The source code for these is in a linked git repository, so you need to check out the git submodules to get it.
Once you have the Postgres development packages installed, you should just need to go into each directory and run make install to compile them.
Ideally you should have only one Postgres version installed on the server, since otherwise you need to tell the make commands which version of Postgres you are using.

See https://github.com/metabrainz/musicbrainz-server/blob/master/INSTALL.md for the full steps on how to set up a database and server if you want to use the Postgres that comes with your OS.

AFAIK, MB Server has not been tested with Postgres 11 yet, but the plan is to move to Postgres 10.
Note that a slave server should probably encounter fewer issues than musicbrainz.org, if any.

Topping this, as I’m considering getting back to it and attempting to load MusicBrainz into Postgres 15, which is the version of Postgres recommended by my distro.

I was told last night on IRC that the MusicBrainz DB works fine on Postgres 15, so I’m feeling optimistic.

Since last year, the MusicBrainz Docker Compose project supports setting up a Postgres database-only mirror.

For a fresh install, closely follow the documentation and use the command containing alt-db-only-mirror mentioned at the beginning of the “Installation” section.

Note: Make sure to stop the Postgres server from your distro beforehand, or to use a different port number.
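
To check for such a conflict before starting, a minimal Python sketch like the following can tell whether something is already listening on a port. It assumes the distro’s Postgres uses PostgreSQL’s default port 5432 on localhost; the function name `port_in_use` is hypothetical, and whether the Docker setup publishes that same port depends on its configuration.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


# 5432 is PostgreSQL's default port; adjust if your server uses another.
print(port_in_use(5432))
```

If this prints True, either stop the local server or pick a different port for the container.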
