How difficult is it to set up a local MusicBrainz server?

hiccup · January 18, 2020, 3:04pm

I read about the possibility to set up your own local MusicBrainz server.
It might be an interesting thing for me to try, but…

While I am savvy with Windows, my experience with, and talent for, Linux is less than mediocre.
In the past I built a NAS server running NAS4Free (FreeBSD-based), and I managed to keep it running and updated for many years, but using ssh, the console and related Linux commands was a big challenge and a kind of self-torture for me.
(I abandoned it after a while and am now using a Windows Server NAS.)

So, my question is this:

As far as I understand after a quick read, I could set up a Windows PC for the task, use VirtualBox, and then load some sort of largely pre-configured MB database image?

Considering my above admission of being terrible with Linux, and usually only succeeding after lots of help from experienced users who are willing and able to give some sort of spoon-feeding support, would it be better for me to just forget about even trying this, or is it not that difficult and would I have a chance of getting it to work?

dns_server · January 18, 2020, 11:58pm

The question I would be looking at is what you are trying to achieve by running your own server.

Do you want a replica database that you can run SQL queries against?
Do you want a replica database and server that you can point Picard to and query the API?

The VM is designed so that you can download and run it, but you need to be a little mindful of how networking works in the VM.
You will be using NAT, so the VM will sit inside its own network and be able to connect outwards to download replication packets without too much work.
To allow your PC to connect into the VM, you need to configure the VM’s networking to forward certain ports.
The VM also uses a technology called Docker, which does its own networking, so you need to run a few commands inside the VM to expose the database ports if you want to connect to the database.
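
Once the ports are forwarded, a quick way to check from the host whether anything answers is to attempt a TCP connection. The following is only a minimal Python sketch: the helper name `port_is_open` is made up for illustration, and the port numbers 5000 (web server) and 5432 (PostgreSQL) are assumptions that depend on how the forwarding was actually configured.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed forwarded ports; adjust to your own VM and Docker setup.
    for name, port in [("web/API", 5000), ("PostgreSQL", 5432)]:
        state = "reachable" if port_is_open("127.0.0.1", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

If a port shows as not reachable, the forwarding in the VM or the port exposure in Docker is the first thing to check.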

If all you want is an SQL database, https://github.com/lalinsky/mbdata may be an option, but I do not have experience with it.

InvisibleMan78 · January 19, 2020, 12:00am

If you try to set up a local MB server on Linux (like Ubuntu), it’s very complicated to get it running from the source code.

BUT:
You can use the (nearly) ready-to-use virtual machine in VirtualBox, which you mentioned. There is only one drawback: this VM is highly outdated, and you need some manual steps to get it up, running and up to date.

hiccup · January 19, 2020, 9:11am

That’s a very good question, and the answer is: I am not really sure.

I wasn’t able to find a layman’s explanation on why you would want to create a local MB database, so I started guessing.

My idea/intention is this:

Once in a while I would want to do a streak of updating my music library through Picard.
The implemented delay of 1 second per track can already make that a time-consuming job.
But especially with classical music and using some plugins, it will take a lot longer.
(that probably has to do with lots of ‘relationships’)
And the same goes for separate tracks and compilation albums.
They can also take a very long time to get matched.

So I am guessing that if I had the MB database locally, most of that delay would disappear.

Is this a correct assumption?
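
To put rough numbers on that delay, here is a minimal Python sketch. It assumes the 1-second delay applies to every request and that some releases need several requests per track; the function names, the library size of 5000 tracks and the figure of 3 requests per track are illustrative assumptions, not values taken from Picard.

```python
def tagging_time_seconds(tracks: int, requests_per_track: float = 1.0,
                         delay_s: float = 1.0) -> float:
    """Lower bound on the waiting time imposed by a fixed per-request delay."""
    return tracks * requests_per_track * delay_s


def as_hours_minutes(seconds: float) -> str:
    """Format a duration in seconds as 'XhYYm', rounded to the nearest minute."""
    minutes = round(seconds / 60)
    return f"{minutes // 60}h{minutes % 60:02d}m"


if __name__ == "__main__":
    # Hypothetical library of 5000 tracks.
    print(as_hours_minutes(tagging_time_seconds(5000)))     # one request per track
    print(as_hours_minutes(tagging_time_seconds(5000, 3)))  # assumed 3 requests per track
```

The point of the arithmetic is only that the waiting time grows linearly with the number of requests, so extra lookups for relationships multiply it.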

InvisibleMan78 · January 19, 2020, 9:43am

Yes, but not by default. You have to adjust this rate limit in Picard too.

And you need a very fast machine with a lot of RAM to get your results faster locally than from MB.

hiccup · January 19, 2020, 10:31am

Thanks InvisibleMan78,

So if speed gains are not that obvious or easily achieved, what are other reasons for users to set up a local MB database?

InvisibleMan78 · January 19, 2020, 10:40am

In my case there are 2 main reasons:

a) I bulk-tag my files with my own command-line tagging program, which accesses the local MB database directly (querying the VM version with SQL) and fetches only the few pieces of metadata I really need.

b) I fill an additional SQL table in the local MB database with all my local song filenames to access them more easily (search & play).
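
The idea in b) can be sketched roughly as follows. This is not the actual setup described above: it uses an in-memory SQLite database as a stand-in for the PostgreSQL database in the VM, a heavily simplified `recording` table, and a hypothetical `local_file` table; all table contents, column choices and function names are assumptions for illustration.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in database (not the real MusicBrainz schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE recording (gid TEXT PRIMARY KEY, name TEXT NOT NULL);
        -- Hypothetical extra table mapping local files to recordings.
        CREATE TABLE local_file (
            path TEXT PRIMARY KEY,
            recording_gid TEXT NOT NULL REFERENCES recording(gid)
        );
        """
    )
    conn.executemany(
        "INSERT INTO recording (gid, name) VALUES (?, ?)",
        [("gid-0001", "Example Sonata"), ("gid-0002", "Another Song")],
    )
    conn.executemany(
        "INSERT INTO local_file (path, recording_gid) VALUES (?, ?)",
        [("/music/example_sonata.flac", "gid-0001"),
         ("/music/another_song.mp3", "gid-0002")],
    )
    return conn


def find_files(conn: sqlite3.Connection, title_fragment: str) -> list:
    """Return (title, path) pairs whose recording title contains the fragment."""
    query = """
        SELECT r.name, f.path
        FROM recording AS r
        JOIN local_file AS f ON f.recording_gid = r.gid
        WHERE r.name LIKE ?
        ORDER BY f.path
    """
    return conn.execute(query, (f"%{title_fragment}%",)).fetchall()


if __name__ == "__main__":
    print(find_files(build_demo_db(), "sonata"))
```

Keeping the file paths in a separate table leaves the replicated tables untouched, which is presumably why an additional table is used.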

nadl40 · January 27, 2020, 4:39pm

I would like to add the main reason why I use a local VM: at times I don’t have an internet connection at a remote place that I often visit.

tdiaz · January 27, 2020, 8:52pm

By chance, just how much is ‘a lot’ with reference to running a local instance of the MB server?

I gave up on it after three days of seeing no progress once it started replicating. I wanted to take the VM and image it back to real hardware just so I could actually see if it was doing anything or not.

My desire for the local instance is along the lines of the reasons you’ve given, though mainly b), with a focus on learning more about how Picard interacts with MB as well.

InvisibleMan78 · January 27, 2020, 9:22pm

The more RAM the better. I would not start it with less than 8 GB.
If you want to run your own search server too, I would suggest at least 16 GB RAM.
(The 4 GB RAM and 2 cores mentioned in the documentation are the technical minimum; don’t even try it.)

You can open a second terminal window and enter this command line:
sudo docker-compose exec musicbrainz /usr/bin/tail -f slave.log
Then you can see the processing of every single replication packet (produced every full hour).

BUT:
Don’t start replicating from 2018-08-14; this will take forever.
Download and import a so-called “Full Data Dump” that is only a few days old. Then replicate the missing packets up to now.
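
A quick calculation shows why the starting point matters. The sketch below assumes exactly one replication packet per full hour, as stated above; the function name and the example dates for the data dump are illustrative.

```python
from datetime import datetime


def hourly_packets_between(start: datetime, end: datetime) -> int:
    """Number of hourly replication packets between two points in time."""
    return int((end - start).total_seconds() // 3600)


if __name__ == "__main__":
    now = datetime(2020, 1, 27)
    # Replicating from the old VM image date.
    print(hourly_packets_between(datetime(2018, 8, 14), now))
    # Replicating from a (hypothetical) data dump that is three days old.
    print(hourly_packets_between(datetime(2020, 1, 24), now))
```

Starting from 2018-08-14 means well over twelve thousand packets to process, while a dump that is three days old leaves only 72.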

nadl40 · January 28, 2020, 12:29pm

My experience in terms of resources is not too bad. I’m using a Lenovo ThinkCentre i7 with 4 cores, 8 GB of RAM and an SSD as the Linux host.

tdiaz wrote: “By chance, just how much is ‘a lot’ with reference to running a local instance of the MB server?”

The VM is allocated 4 GB and 2 CPUs, which gives you 4 threads.

InvisibleMan78 wrote: “This VM is highly outdated and you need some manual steps to get it up, running and actualized.”

It did run for 3 days… but it eventually finished. I had to do the May 2019 schema upgrade; I found instructions in the forum. The index rebuild ran in a matter of hours.

Overall, this local instance is a bit snappier than the master and replicates on schedule. Quite a lot of fun actually.

hiccup · February 3, 2020, 6:43pm

Thanks for the feedback and information everybody!
I can now be fairly certain that this would not be a project I would enjoy much, nor one I could bring to a happy end.

jesus2099 · November 18, 2020, 10:52am

Hello there,

I am really not a usual reader of this category (which is actually one of my muted categories - but I will remove it now!).

This topic title is exactly what I was looking for.
The topic answered @hiccup’s situation but not mine yet, and it seems I could get good advice here.

So I cherry-picked some of your questions and let me describe my situation:

I very rarely submit ultra minimalist code changes (PR) to MBS.
Usually it’s just some StyleSheet (CSS) changes, not even JavaScript.
Those two can be tested easily with user CSS and user scripts run directly on MusicBrainz.org in any browser.

But sometimes I would like to fix this or that server code bug that requires a running server to test if my change solves the issue or not (most specifically as I don’t know the language used in MBS so it’s very step by step try and adapt until it works).

I used to have a celes.mbsandbox.org for this kind of change.
It was great, as it allowed me to test my patches with a simple git checkout of my branch on my sandbox.
But now I understand they won’t be brought back.

Oh, that is what I want indeed.
I don’t really care about the database data freshness.
From your linked page, however, it seems there is a well-documented install procedure for development that looks much easier to follow (Docker? / Compose?).

And, uh oh, this is my exact situation (4 GB RAM and 2 Cores).

So I have:

a 4 GB RAM and 2 Cores Linux (Debian 10 Xfce) main PC,
a Raspberry Pi 3B and a 3B+ with Raspbian (Raspberry Pi OS) 10, and also
an office laptop under Windows 10.

I would like to use one of my home computers rather than the office laptop (even if I know Windows better than Linux too).

Installing Docker (and Compose?) on Debian seems a little more complex than a normal package install.
It seems possible on Raspbian, with a helper script of theirs.

At this point, I forget about Windows. It seems I would prefer trying this stuff on the RPi, to not pollute the real computers.

Do you think that my memory and CPU config and my lack of Docker knowledge are really a blocker?

InvisibleMan78 · November 18, 2020, 12:15pm

To be honest: unless you are willing to invest hours reading online documentation and even more hours climbing a very steep learning curve, the answer is yes, this combination with your hardware is (almost) a blocker.

A side note:

That’s the point where you should start using some form of virtualization (VMware, VirtualBox, Hyper-V…)

jesus2099 · November 18, 2020, 1:54pm

I thought using Docker was about doing that.

I remember once trying virtualisation (I don’t remember which program, but not Docker) under Windows XP on that same computer. It was as slow as watching a video frame by frame, almost unusable.

But my PC is not really that slow.
So it was maybe because of Windows XP (I tried 10, it’s faster), or because of the virtualisation program, or because of what I tried to do in there (Impulse Tracker 2).

I have an Intel Core 2 Duo E6300 as I assembled my PC around 2006.

InvisibleMan78 · November 18, 2020, 2:42pm

We are getting off-topic, so just a short copy & paste (replace “VMware” with any virtualization software other than Docker):

VMware emulates machine hardware whereas Docker emulates the operating system in which your application runs. Docker is a much more lightweight virtualization technology since it does not have to emulate server hardware resources. The focus is on abstracting the environment required by the app, rather than the physical server. VMware, just like actual machine hardware, lets you install operating systems and other tasks that require a full server.

The speed depends heavily on your CPU and RAM. The more, the faster. Your machine from 2006 is really old and not suited to running any form of virtualization, whether Docker or anything else. Sorry.

jesus2099 · November 18, 2020, 9:35pm

Challenge accepted!
My 2006 PC has always been able to do anything I wanted.
I’ve read somewhere that Docker was able to run on Raspberry Pi 2B.
So I may have 3 machines able to run MBS.
I’ll report here.

IvanDobsky · November 19, 2020, 12:03am

I also run ancient kit, but a big tip is to go look on eBay or similar: read up on what your motherboard can take, then max it out via eBay for cheap. You’ll find a quad core on there for peanuts, and that extra 4 GB of RAM. You can keep that steam-powered kit, but at least this way you can max it out in a way you could never have afforded in 2006.

dns_server · November 19, 2020, 2:53am

Docker is relatively easy to use and understand. There is a learning curve, but you should be able to work things out.
Installing Docker on Debian / Ubuntu is straightforward. There are older packages in the normal OS archive, but you should go to docker.com and use the Docker-provided apt sources to get a more up-to-date Docker.

Docker runs fine on Windows, especially once you have Windows build 2004 (or an up-to-date older build) and have Windows Subsystem for Linux version 2 installed.
This offers great performance without the overhead of an emulator *, so it should run with similar performance to running Docker on Linux.
Follow the instructions at the following to install wsl2 then install ubuntu (recommended but not required) and docker desktop.

I would not use the Raspberry Pi to run Docker and MusicBrainz.
The Docker containers are built for x86_64, so you need to recompile everything to get this running on a Raspberry Pi.
The Docker containers have been built with lots and lots of layers, so you need to recompile all of these layers to get a running container. I have done this myself and had it running on an ARM-based NAS.
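
Before trying, you can check which architecture a machine reports. The sketch below is a minimal Python illustration that assumes, as stated above, that the prebuilt containers target x86_64 only; the function name and the set of accepted names are assumptions for the example.

```python
import platform

# Names commonly reported for 64-bit x86 ("x86_64" on Linux, "AMD64" on Windows).
X86_64_NAMES = {"x86_64", "amd64"}


def can_run_prebuilt_images(machine: str) -> bool:
    """Return True if the reported machine type is 64-bit x86."""
    return machine.lower() in X86_64_NAMES


if __name__ == "__main__":
    machine = platform.machine()
    if can_run_prebuilt_images(machine):
        print(f"{machine}: prebuilt x86_64 containers should match this CPU")
    else:
        print(f"{machine}: containers would have to be rebuilt for this architecture")
```

A Raspberry Pi typically reports an ARM machine type such as `armv7l` or `aarch64`, which falls into the second branch.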

* Docker on Windows does use parts of the emulator for virtual I/O to store a Linux file system image, but there is a Linux kernel running as a binary under Windows.

jesus2099 · December 3, 2020, 8:02pm

Wow thanks very much @dns_server for this huge post, there is so much we can learn everyday!
Thanks too @InvisibleMan78, but I don’t really understand why Docker wouldn’t work on my machine (I don’t need hardware emulation like VMware).
I have to see it working or not, to make sure.

Wow I did not know that Linux for Windows thing!
Oh but my office PC does not meet the build requirement for version 2.
And I don’t have admin rights anyway. I shouldn’t install any funny stuff.

Oh that’s definitely a no go for me, it sounds terrible.

So I will stay on my Debian and install Docker then install Docker Compose on my Debian PC then install the development MusicBrainz server.

I preferred Raspberry Pi because I can reflash its SD when I want.
I was not sure I would know how to completely uninstall this stuff from Debian, as it’s not simple packages, if it doesn’t work.