Earlier this year, I wrote a command-line program unimaginatively named soundalike that uses Chromaprint’s fpcalc utility to look for duplicate recordings within my music collection without talking to external services like AcoustID. It’s been working pretty well, so I figured I should finally get it into a state where other people can try it out.
The basic idea is that the program fingerprints all the audio files under a directory and then prints clusters of similar files:
% ./soundalike testdata
2022/09/26 16:03:23 Finished scanning 6 files
64/Fanfare for Space.mp3 0.47 MB 61.49 sec
orig/Fanfare for Space.mp3 2.35 MB 61.44 sec
64/Honey Bee.mp3 0.44 MB 57.16 sec
orig/Honey Bee.mp3 2.18 MB 57.10 sec
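For anyone curious what this kind of clustering could look like, here is a rough Python sketch. It is not soundalike's actual code. It assumes each file has already been reduced to a list of 32-bit integers (the form that fpcalc prints with its -raw flag), scores two fingerprints by the fraction of matching bits, and groups files whose score reaches a threshold. The names bit_similarity and cluster and the 0.9 default threshold are made up for illustration, and the sketch ignores the offset alignment that a real comparison would need.

```python
from typing import Dict, List, Sequence


def bit_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Return the fraction of matching bits over the overlapping prefix."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    # XOR leaves a 1 wherever the two 32-bit values disagree.
    differing = sum(bin((x ^ y) & 0xFFFFFFFF).count("1") for x, y in zip(a, b))
    return 1.0 - differing / (32 * n)


def cluster(fingerprints: Dict[str, Sequence[int]],
            threshold: float = 0.9) -> List[List[str]]:
    """Group file names whose fingerprints are at least `threshold` similar."""
    names = sorted(fingerprints)
    parent = {name: name for name in names}

    def find(name: str) -> str:
        # Follow parent links to the group's representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Compare every pair once and merge the groups of similar files.
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = bit_similarity(fingerprints[first], fingerprints[second])
            if score >= threshold:
                parent[find(second)] = find(first)

    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    # Only groups with more than one member are potential duplicates.
    return [members for members in groups.values() if len(members) > 1]
```

Comparing every pair is quadratic in the number of files, which is fine for a sketch but would need something smarter for a large collection.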
You can also do one-off comparisons between two files:
% soundalike -compare -compare-interval 100 instrumental.mp3 vocals.mp3
100: 0.983
200: 0.984
300: 0.983
400: 0.989
500: 0.819
600: 0.757
700: 0.751
800: 0.753
...
There are various flags for tweaking thresholds, ignoring false positives in later runs, etc.
If you’re interested in trying it out, I’ve uploaded precompiled binaries for Linux and Windows (x86-64) at https://github.com/derat/soundalike/releases/tag/v0.1. All of my usage so far has been on Linux, although I verified that it at least runs on Windows 10. I don’t know of any reason why it wouldn’t work on macOS, but there’s no precompiled version (yet?) since cross-compiling seems like a nightmare and I wouldn’t have a good way of testing it.
I’ve tried to document how to install and use the program in the README.md file. Note in particular that you’ll probably need to copy fpcalc or fpcalc.exe from a Chromaprint release to a location where soundalike can find it.
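If you want a quick way to check whether an fpcalc binary is reachable through your PATH, a small Python snippet like the one below will do it. This only inspects PATH. The README remains the authority on where soundalike actually looks, and find_fpcalc is just an illustrative helper name.

```python
import shutil
from typing import Optional, Sequence


def find_fpcalc(names: Sequence[str] = ("fpcalc", "fpcalc.exe")) -> Optional[str]:
    """Return the full path of the first matching executable on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    path = find_fpcalc()
    print(path if path else "fpcalc not found on PATH")
```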
And if you run into any issues or have suggestions, please feel free to create an issue!