I am working on a script that finds releases that have no AcousticBrainz data but do have a link to Bandcamp (there are currently 191k such releases), downloads each release from Bandcamp streaming, and submits it to AB. It can submit 4000 releases to AB from one PC (the bottleneck is the AB client's CPU consumption).
I think there are a few things to discuss:
Legal aspect. The legality of downloading music from a stream is, let's say, questionable. But I don't think it is a problem for me (Bandcamp seem to be good guys, and I don't live in the US, which makes prosecution more complicated). The legal purity of AB data collection does not seem to be in danger here.
Data quality. BC streaming is MP3 at 128 kbit/s, so not exactly lossless. Besides, there can be (and I believe there are) wrong (or at least not entirely right) links to Bandcamp in the MB database. I carefully and automatically collate the tracklists from MB and BC. The script considers about 15% of downloaded releases not to match the MB release, mostly falsely. But I'm sure that among the 191k links there are some that will falsely pass my filter.
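To illustrate what such a tracklist collation could look like, here is a minimal sketch in Python. It is not the actual script: the function names, the (title, length in seconds) tuple format, and both thresholds are assumptions made up for the example, and it assumes every track length is known.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(title):
    """Casefold, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = re.sub(r"[^\w\s]", " ", no_marks.casefold())
    return " ".join(cleaned.split())


def tracklists_match(mb_tracks, bc_tracks, min_title_ratio=0.85, max_length_diff=5.0):
    """Compare two tracklists given as lists of (title, length_in_seconds).

    Hypothetical rule: same number of tracks, and for every position the
    normalized titles are similar enough and the lengths differ by at most
    max_length_diff seconds. Both thresholds are illustrative guesses.
    """
    if len(mb_tracks) != len(bc_tracks):
        return False
    for (mb_title, mb_len), (bc_title, bc_len) in zip(mb_tracks, bc_tracks):
        ratio = SequenceMatcher(None, normalize(mb_title), normalize(bc_title)).ratio()
        if ratio < min_title_ratio:
            return False
        if abs(mb_len - bc_len) > max_length_diff:
            return False
    return True


mb = [("Café del Mar", 201.0), ("Intro", 62.5)]
bc = [("Cafe Del Mar", 200.2), ("intro", 63.0)]
print(tracklists_match(mb, bc))
```

A strict rule like this rejects good releases whenever Bandcamp and MB spell a title differently or one side has a bonus track, which is the kind of false mismatch described above. Loosening the thresholds trades those for more wrong links passing the filter.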
The script can also submit acoustic fingerprints to AcoustID, but AcoustID seems to be having some issues right now (maybe provoked by the mass data submission during the script test).
Some releases on Bandcamp can be «bought» for $0 and downloaded in lossless, but that is harder to automate. Maybe there are other similar sources of data, or other data that we can automatically get from this source.
P.S. Legal statement: I do not distribute music downloaded from Bandcamp streaming, do not listen to this music, and keep it on my computer for only a few minutes.