How to import edit/editor database dumps into a replicated database on musicbrainz-docker?

How can I import the supplemental MusicBrainz data dumps in mbdump-edit.tar.bz2 and mbdump-editor.tar.bz2 into a replicated MusicBrainz database and server on the musicbrainz-docker compose stack? I am trying to populate the editor and edit* tables, which are not provided through replication.

I had no trouble following the MusicBrainz Database / Download instructions. I figured out how to give the musicbrainz-1 container access to the dump files on my host file system. I think the Install.md instructions from the musicbrainz-server repo are trying to point me in the right direction.

Install.md suggests running this command to make a new database from a complete data dump:

./admin/InitDb.pl --createdb --import /tmp/dumps/mbdump*.tar.bz2 --echo

Combining that with the musicbrainz-docker invocations, I think I might be able to import the extra data I want with a command from my host’s shell, like:

% docker compose run --rm musicbrainz admin/InitDb.pl --import --table editor /fromhost/fullexport/mbdump-editor.tar.bz2 --echo

but this returns the following error message:

Can't locate aliased.pm in @INC (you may need to install the aliased module) (@INC entries checked: /musicbrainz-server/admin/../lib /root/perl5/lib/perl5/5.38.2/x86_64-linux-gnu-thread-multi /root/perl5/lib/perl5/5.38.2 /root/perl5/lib/perl5/x86_64-linux-gnu-thread-multi /root/perl5/lib/perl5 /usr/local/lib/perl5/site_perl/5.38.2/x86_64-linux-gnu-thread-multi /usr/local/lib/perl5/site_perl/5.38.2 /usr/local/lib/perl5/vendor_perl/5.38.2/x86_64-linux-gnu-thread-multi /usr/local/lib/perl5/vendor_perl/5.38.2 /usr/local/lib/perl5/5.38.2/x86_64-linux-gnu-thread-multi /usr/local/lib/perl5/5.38.2) at /musicbrainz-server/admin/../lib/MusicBrainz/Server/DatabaseConnectionFactory.pm line 6.
BEGIN failed--compilation aborted at /musicbrainz-server/admin/../lib/MusicBrainz/Server/DatabaseConnectionFactory.pm line 6.
Compilation failed in require at /musicbrainz-server/admin/../lib/DBDefs.pm line 31.
BEGIN failed--compilation aborted at /musicbrainz-server/admin/../lib/DBDefs.pm line 31.
Compilation failed in require at /musicbrainz-server/admin/InitDb.pl line 10.
BEGIN failed--compilation aborted at /musicbrainz-server/admin/InitDb.pl line 10.

This says to me that docker compose run is not arriving at a properly set up Perl context on the musicbrainz-1 container.

When I try to run the same admin/InitDb.pl command from a shell within musicbrainz-1 via Docker Desktop, I get a similar Perl error. Interestingly, the module it can’t find is String/ShellQuote.pm instead.

So: how can I import the tables from the edit/editor database dumps into my replicated database on musicbrainz-docker?

We use Carton to manage dependencies, which requires invoking Perl with carton exec. I don’t have musicbrainz-docker set up on my new laptop, so I was unable to test this, but I would try something like docker compose run --rm musicbrainz bash -c 'carton exec -- admin/InitDb.pl [...]'. That should resolve the Perl errors at least.
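
The suggestion above could be wrapped in a small shell helper. This is only a sketch: the function name mb_admin and the DRY_RUN switch are made up for illustration and are not part of musicbrainz-docker. The service name musicbrainz comes from the commands in this thread.

```shell
# Hypothetical helper (not part of musicbrainz-docker): run a
# musicbrainz-server admin script under Carton in a one-off container.
# Set DRY_RUN=1 to print the docker command instead of running it.
mb_admin() {
    # "$*" joins the script path and its arguments into one string,
    # which becomes the command line handed to bash -c in the container.
    inner="carton exec -- $*"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "docker compose run --rm musicbrainz bash -c '$inner'"
    else
        docker compose run --rm musicbrainz bash -c "$inner"
    fi
}
```

For example, DRY_RUN=1 mb_admin admin/InitDb.pl --help would print the full docker compose command without starting a container. Arguments containing spaces or quotes would need extra quoting, which this sketch does not handle.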

Thank you for the tip, @Bitmap! It worked.

What seemed to work for me were these commands:

% docker compose run --rm musicbrainz bash -c 'carton exec -- admin/MBImport.pl /fromhost/fullexport/mbdump-edit.tar.bz2' 
[... lots of output omitted ...]
Loaded 17 tables (1000676759 rows) in 7287 seconds

% docker compose run --rm musicbrainz bash -c 'carton exec -- admin/MBImport.pl /fromhost/fullexport/mbdump-editor.tar.bz2'
[... lots of output omitted ...]
Loaded 0 tables (0 rows) in 50 seconds

(The contents of mbdump-editor.tar.bz2 seem to be handled in a non-standard way: the output mentions “mbdump/editor_sanitised” and claims “No data file found for ‘editor’, skipping”, but the result is an editor table, with names, membership dates, etc. scrubbed clean of identifying information.)

I now seem to have editor and edit* tables that I can use. Thank you!


Ah, not so fast!

Shortly after I imported these tables, I noticed that my replication was broken. As discussed in the topic Replication error: “current row … contains a different value … than the replication packet suggests it should have”, it seems that:

That meant that the replication jobs failed.

Apparently, it is advisable to pass the --noupdate-replication-control option to MBImport.pl when importing the editor and edit* tables, like so:

% docker compose run --rm musicbrainz bash -c 'carton exec -- 
admin/MBImport.pl --noupdate-replication-control /.../mbdump-edit.tar.bz2' 

(Line breaks added for legibility.)
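
For anyone repeating this, both supplemental dumps could be imported with that option in one go. The following is a sketch rather than a tested recipe: the function names, the DUMP_DIR variable and the DRY_RUN switch are hypothetical, and /fromhost/fullexport is simply the container-side path I used earlier.

```shell
# Hypothetical helpers: import the edit and editor dumps without letting
# MBImport.pl update replication_control.
# DUMP_DIR is the container-side directory holding the dumps (assumption:
# same mount as earlier in this thread). DRY_RUN=1 only prints commands.
DUMP_DIR="${DUMP_DIR:-/fromhost/fullexport}"

import_supplemental() {
    inner="carton exec -- admin/MBImport.pl --noupdate-replication-control $1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "docker compose run --rm musicbrainz bash -c '$inner'"
    else
        docker compose run --rm musicbrainz bash -c "$inner"
    fi
}

import_all_supplemental() {
    # Same order as in this thread: edit first, then editor.
    for name in mbdump-edit.tar.bz2 mbdump-editor.tar.bz2; do
        # Stop at the first failed import.
        import_supplemental "$DUMP_DIR/$name" || return 1
    done
}
```

Running DRY_RUN=1 import_all_supplemental first shows the two commands that would be executed, which makes it easy to check the paths before starting a multi-hour import.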

That topic also had a way to repair the replication_control table:

That worked for me. I changed $ignore_conflicts to 1. Then I ran a few replication cycles manually, which, despite intermittent failures, got me part-way caught up. I restored $ignore_conflicts to 0. Further manual replication cycles continued to work.
