I think Picard could be extended in two ways to allow for music deduplication:
- Using a combination of AcoustIDs and MBIDs.
- Using a simple check of whether the destination (when the move option is enabled) already has a file of the same name. In this situation Picard currently renames the file transparently with a numerical suffix.
For the first case, I propose that we add a deduplicate action which:
- Asks for the root directory to traverse.
- Iterates over all files (maybe skipping files without MBIDs to improve speed; this will need testing with an actual prototype) and creates an SQLite DB (or a simple plaintext CSV) with the filename, path, AcoustID and MBID.
- Performs deduplication using a scoring algorithm (will need to design this and write a spec) that favours candidates with, for example:
  - Higher bitrate
  - More tags than the other candidates
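
To make the idea more concrete, here is a rough sketch of the index and ranking steps using only Python's standard `sqlite3` module. Everything in it is an assumption for discussion: the record layout, the `build_index` and `find_duplicates` names, and the ranking order (bitrate first, then tag count) are hypothetical, and it does not use any Picard API. It groups on MBID only and skips files without one, leaving the AcoustID column for a later matching pass.

```python
import sqlite3
from collections import defaultdict

# Hypothetical record layout: (path, acoustid, mbid, bitrate_kbps, tag_count).
# A real implementation would fill these in from the files' metadata.
def build_index(records):
    """Store one row per scanned file in an in-memory SQLite database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE files ("
        "path TEXT PRIMARY KEY, acoustid TEXT, mbid TEXT, "
        "bitrate INTEGER, tag_count INTEGER)"
    )
    db.executemany("INSERT INTO files VALUES (?, ?, ?, ?, ?)", records)
    db.commit()
    return db

def find_duplicates(db):
    """Group files that share an MBID and rank each group, best candidate first.

    The ranking here is a placeholder for the scoring algorithm:
    higher bitrate wins, then the larger number of tags.
    """
    groups = defaultdict(list)
    rows = db.execute(
        "SELECT path, mbid FROM files WHERE mbid IS NOT NULL "
        "ORDER BY mbid, bitrate DESC, tag_count DESC, path"
    )
    for path, mbid in rows:
        groups[mbid].append(path)
    # Only groups with more than one file are duplicates.
    return {mbid: paths for mbid, paths in groups.items() if len(paths) > 1}
```

In each returned group the first path would be the file to keep and the rest would be candidates for removal, presumably after the user confirms.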
As for the second case, the fix is very simple: provide an option in the move files page to either “warn” or “silently continue” when such a case arises.
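
A minimal sketch of how such an option might behave, again in plain Python and not based on Picard's actual code. The `OnCollision` enum, the `resolve_destination` function and the ` (n)` suffix format are all hypothetical, and I am assuming that "silently continue" means keeping the current rename-with-suffix behaviour.

```python
import os
from enum import Enum

class OnCollision(Enum):
    WARN = "warn"
    SILENT = "silently continue"

def resolve_destination(dest, policy, exists=os.path.exists):
    """Return (path_to_use, warning). Exactly one of the two is None
    when the destination is already taken and the policy is WARN."""
    if not exists(dest):
        return dest, None
    if policy is OnCollision.WARN:
        # Do not move the file; let the caller show the warning.
        return None, f"destination already exists: {dest}"
    # SILENT: pick the first free name with a numerical suffix
    # (the suffix format is an assumption).
    root, ext = os.path.splitext(dest)
    n = 1
    while exists(f"{root} ({n}){ext}"):
        n += 1
    return f"{root} ({n}){ext}", None
```

The `exists` parameter is only there so the logic can be tried without touching the filesystem.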
I’d like to discuss this idea further and flesh it out into a proper detailed proposal.