[SOLVED] Import of dumps currently not possible - ERROR 404: Not Found

After the data dump files have been downloaded (all OK), the process prints “Fetching database dump…” and then fails with this output:

Saving to: ‘/media/dbdump/index.html’

index.html                               [ <=>                                                                ]     634  --.-KB/s    in 0s      

2025-05-06 10:44:43 (476 MB/s) - ‘/media/dbdump/index.html’ saved [634]

<html>
<head><title>Index of /pub/musicbrainz/data/fullexport/</title></head>
<body>
<h1>Index of /pub/musicbrainz/data/fullexport/</h1><hr><pre><a href="../">../</a>
<a href="20250430-001856/">20250430-001856/</a>                                   30-Apr-2025 03:28       -
<a href="20250503-001837/">20250503-001837/</a>                                   03-May-2025 03:25       -
<a href="LATEST">LATEST</a>                                             03-May-2025 03:25      16
<a href="latest-is-20250503-001837">latest-is-20250503-001837</a>                          03-May-2025 03:25       0
</pre><hr></body>
</html>
--2025-05-06 10:44:43--  https://data.metabrainz.org/pub/musicbrainz/data/fullexport//MD5SUMS
Resolving data.metabrainz.org (data.metabrainz.org)... 138.201.203.43
Connecting to data.metabrainz.org (data.metabrainz.org)|138.201.203.43|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2025-05-06 10:44:44 ERROR 404: Not Found.

Is any file missing from the data dump servers?

Hi, I checked the FTP server and didn’t find any missing files, nor any errors generating the recent dumps. @reosarevok also tested running the fetch-dump.sh script (with the “both” and “replica” arguments) and didn’t encounter any issues.

The URL it’s fetching for you is broken because the timestamp is missing between the two slashes (“fullexport//MD5SUMS”). Not sure why that is. Can you still reproduce the error?
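
To illustrate the point about the missing timestamp, here is a minimal sketch of how a fetch script could refuse to build a URL when the timestamp is empty or malformed, instead of requesting “fullexport//MD5SUMS”. The function name build_dump_url is hypothetical and is not taken from fetch-dump.sh; the timestamp pattern is an assumption based on the directory names in the listing above.

```shell
# Hypothetical helper (not part of fetch-dump.sh): build the URL of a file
# inside a timestamped dump directory, and fail early if the timestamp is
# empty or does not look like YYYYMMDD-hhmmss (e.g. 20250503-001837).
build_dump_url() {
  local base_url="${1%/}"   # strip one trailing slash, if present
  local timestamp="$2"
  local file="$3"

  case "$timestamp" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9])
      ;;                    # looks like a dump timestamp, carry on
    *)
      echo "Invalid or empty dump timestamp: '$timestamp'" >&2
      return 1
      ;;
  esac

  printf '%s/%s/%s\n' "$base_url" "$timestamp" "$file"
}
```

For example, calling build_dump_url with “https://data.metabrainz.org/pub/musicbrainz/data/fullexport/”, “20250503-001837” and “MD5SUMS” prints the MD5SUMS URL inside that dump directory, while an empty second argument makes the function return a non-zero status before any download is attempted.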

Yep, still the same when running fetch-dump.sh with the “both” argument:

index.html                                 [ <=>                                                                        ]     634  --.-KB/s    in 0s      

2025-05-06 16:00:52 (424 MB/s) - ‘/media/dbdump/index.html’ saved [634]

<html>
<head><title>Index of /pub/musicbrainz/data/fullexport/</title></head>
<body>
<h1>Index of /pub/musicbrainz/data/fullexport/</h1><hr><pre><a href="../">../</a>
<a href="20250430-001856/">20250430-001856/</a>                                   30-Apr-2025 03:28       -
<a href="20250503-001837/">20250503-001837/</a>                                   03-May-2025 03:25       -
<a href="LATEST">LATEST</a>                                             03-May-2025 03:25      16
<a href="latest-is-20250503-001837">latest-is-20250503-001837</a>                          03-May-2025 03:25       0
</pre><hr></body>
</html>
--2025-05-06 16:00:52--  https://data.metabrainz.org/pub/musicbrainz/data/fullexport//MD5SUMS
Resolving data.metabrainz.org (data.metabrainz.org)... 138.201.203.43
Connecting to data.metabrainz.org (data.metabrainz.org)|138.201.203.43|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2025-05-06 16:00:53 ERROR 404: Not Found.

When I run fetch-dump.sh with the “replica” argument, it seems to work.
Note one detail: in this case the downloaded file (LATEST) no longer contains HTML code…

--2025-05-06 16:04:44--  https://data.metabrainz.org/pub/musicbrainz/data/fullexport/LATEST
Resolving data.metabrainz.org (data.metabrainz.org)... 138.201.203.43
Connecting to data.metabrainz.org (data.metabrainz.org)|138.201.203.43|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16 [application/octet-stream]
Saving to: ‘/media/dbdump/LATEST’

LATEST                        100%[===============================================>]      16  --.-KB/s    in 0s      

2025-05-06 16:04:44 (12.9 MB/s) - ‘/media/dbdump/LATEST’ saved [16/16]

--2025-05-06 16:04:44--  https://data.metabrainz.org/pub/musicbrainz/data/fullexport/20250503-001837/MD5SUMS
Resolving data.metabrainz.org (data.metabrainz.org)... 138.201.203.43
Connecting to data.metabrainz.org (data.metabrainz.org)|138.201.203.43|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 583 [application/octet-stream]
Saving to: ‘/media/dbdump/MD5SUMS’

MD5SUMS                       100%[===============================================>]     583  --.-KB/s    in 0s      

2025-05-06 16:04:45 (768 MB/s) - ‘/media/dbdump/MD5SUMS’ saved [583/583]

--2025-05-06 16:04:45--  https://data.metabrainz.org/pub/musicbrainz/data/fullexport/20250503-001837/mbdump.tar.bz2
Resolving data.metabrainz.org (data.metabrainz.org)... 138.201.203.43
Connecting to data.metabrainz.org (data.metabrainz.org)|138.201.203.43|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6124459430 (5.7G) [application/octet-stream]
Saving to: ‘/media/dbdump/mbdump.tar.bz2’

mbdump.tar.bz2                 46%[=====================>                          ]   2.68G  67.3MB/s    eta 40s

Sorry, I think when reosarevok tested it the script was aborted before it got to the DB dumps (which is where the issue happens). I was able to reproduce it now.

It seems there were some local changes to our search-indexes-dump container that caused an extra dump to be produced today (normally they’re only produced on Wednesdays and Saturdays). But there isn’t a corresponding database dump for this date. cc @yvanzo

For now I’ve worked around this by updating the LATEST and latest-* files on the FTP server to point to the previous search indexes dump:

root@67b24245cccc:/home/musicbrainz/musicbrainz-server# cd /home/musicbrainz/search-index-dumps/
root@67b24245cccc:/home/musicbrainz/search-index-dumps# ls
20250503-201302  20250506-041005  LATEST  latest-is-20250506-041005
root@67b24245cccc:/home/musicbrainz/search-index-dumps# mv latest-is-20250506-041005 latest-is-20250503-201302
root@67b24245cccc:/home/musicbrainz/search-index-dumps# cat LATEST
20250506-041005
root@67b24245cccc:/home/musicbrainz/search-index-dumps# echo 20250503-201302 > LATEST
root@67b24245cccc:/home/musicbrainz/search-index-dumps# cd ../musicbrainz-server/
root@67b24245cccc:/home/musicbrainz/musicbrainz-server# sudo -E -H -u musicbrainz sh -c 'MBS_ADMIN_CONFIG=config.search-indexes-dump.sh ./bin/rsync-fullexport-files'
~/musicbrainz-server/admin ~/musicbrainz-server
~/musicbrainz-server
Warning: Permanently added '[mbfullexport.local]:65415' (ED25519) to the list of known hosts.
sending incremental file list
./
LATEST

sent 2,371 bytes  received 46 bytes  4,834.00 bytes/sec
total size is 102,550,098,694  speedup is 42,428,671.37
Warning: Permanently added '[mbfullexport.local]:65415' (ED25519) to the list of known hosts.
sending incremental file list
latest-is-20250503-201302

sent 134 bytes  received 35 bytes  338.00 bytes/sec
total size is 0  speedup is 0.00
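
The manual steps above (renaming the latest-is-* marker and rewriting LATEST) can be sketched as one small function. This is only an illustration under assumptions: the name repoint_latest is hypothetical, the function is not part of musicbrainz-server, and it only touches the local directory; the files would still have to be synced to the FTP server afterwards, as done with rsync-fullexport-files above.

```shell
# Hypothetical helper: repoint LATEST and the latest-is-* marker in a dump
# directory to an existing timestamped dump.
repoint_latest() {
  local dump_dir="$1"
  local timestamp="$2"

  # Refuse to point LATEST at a dump that does not exist locally.
  if [ ! -d "$dump_dir/$timestamp" ]; then
    echo "No such dump directory: $dump_dir/$timestamp" >&2
    return 1
  fi

  # Remove any stale marker, then create the new one as an empty file.
  rm -f "$dump_dir"/latest-is-*
  : > "$dump_dir/latest-is-$timestamp"

  # LATEST holds just the timestamp followed by a newline.
  printf '%s\n' "$timestamp" > "$dump_dir/LATEST"
}
```

With the directory from the transcript, the equivalent call would be repoint_latest /home/musicbrainz/search-index-dumps 20250503-201302.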

It should work for you now.

Just to confirm:
Yes, it worked fine. Thanks for the quick fix!
