Another idea for identifying duplicates: AcoustID!
Whatever the solution is, we should try to make it fit several use cases.
This feels to me like a plugin waiting to be written - with some options:
a. Use track / recording MBIDs / AcoustIDs for dedup.
b. Merge album data from all files into the file being kept - which probably implies that we need a function that stops this merged data from being lost when you update the tags.
c. Options for deciding which file to keep - largest, smallest, newest, oldest …
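
To make the options above a bit more concrete, here is a rough sketch in plain Python. It is not tied to any real plugin API: `FileInfo`, `KEEP_RULES`, `dedup_key` and `find_duplicates` are all names I made up for illustration, and I am assuming the IDs and album names have already been read from the files. It groups files by recording MBID (falling back to AcoustID), picks the file to keep by a configurable rule, and collects the album names from the whole group so they could be written to the kept file.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class FileInfo:
    """Hypothetical record for one audio file (not a real plugin class)."""
    path: str
    size: int                      # bytes
    mtime: float                   # modification time, seconds since epoch
    recording_mbid: Optional[str] = None
    acoustid: Optional[str] = None
    albums: List[str] = field(default_factory=list)


# Option c: rules for deciding which file in a duplicate group to keep.
KEEP_RULES = {
    "largest": lambda files: max(files, key=lambda f: f.size),
    "smallest": lambda files: min(files, key=lambda f: f.size),
    "newest": lambda files: max(files, key=lambda f: f.mtime),
    "oldest": lambda files: min(files, key=lambda f: f.mtime),
}


def dedup_key(f: FileInfo) -> Optional[Tuple[str, str]]:
    """Option a: prefer the recording MBID, fall back to the AcoustID."""
    if f.recording_mbid:
        return ("mbid", f.recording_mbid)
    if f.acoustid:
        return ("acoustid", f.acoustid)
    return None  # no usable ID, so the file is never treated as a duplicate


def find_duplicates(files, keep="largest"):
    """Return a list of (keeper, discarded, merged_albums) per duplicate group."""
    choose = KEEP_RULES[keep]
    groups = defaultdict(list)
    for f in files:
        key = dedup_key(f)
        if key is not None:
            groups[key].append(f)

    result = []
    for group in groups.values():
        if len(group) < 2:
            continue
        keeper = choose(group)
        discarded = [f for f in group if f is not keeper]
        # Option b: union of album names, keeper's albums first, order preserved.
        merged_albums = []
        for f in [keeper] + discarded:
            for album in f.albums:
                if album not in merged_albums:
                    merged_albums.append(album)
        result.append((keeper, discarded, merged_albums))
    return result
```

Nothing here deletes files or writes tags; it only reports what would be kept and what album data would be carried over. Protecting the merged album data from being overwritten on a later tag update would still need separate handling.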