I’d run that on matched files only. You need the MBIDs, so the user should do the identification first.
The alternative would be to always allow the calculation, but have a separate submission step once files have been matched. That would be similar to the AcoustID “Scan” behavior. The difference is that “Scan” provides value in itself, whereas a pure AcousticBrainz calculation probably does not. So I’d do the analysis and submission in one step.
Remembering already submitted files is useful, as feature extraction is very slow. The current AB submission utility remembers files already submitted. I think it does so by path. I wonder if we can do better.
What about storing a tuple of recording ID / AcoustID fingerprint / length? The AcoustID fingerprint is comparatively cheap to calculate with fpcalc. Maybe some information on the audio codec would also be needed, not sure.
I’d probably store the submission info in a dedicated file.
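To make the idea a bit more concrete, here is a rough sketch of what such a dedicated file could look like. Everything in it is an assumption for illustration: the names `submission_key` and `SubmissionLog`, the JSON format, and the choice to hash the fingerprint are all made up here, and the fingerprint string is assumed to have been obtained from fpcalc beforehand.

```python
import hashlib
import json
from pathlib import Path


def submission_key(recording_id, fingerprint, length):
    """Build a key from recording MBID, AcoustID fingerprint and length in seconds."""
    # Hash the fingerprint so the stored key stays short (fingerprints can be long).
    fp_hash = hashlib.sha1(fingerprint.encode("utf-8")).hexdigest()
    # Round the length to whole seconds so tiny float differences do not matter.
    return f"{recording_id}:{fp_hash}:{int(round(length))}"


class SubmissionLog:
    """Hypothetical registry of already submitted files, kept in one JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            self.keys = set(json.loads(self.path.read_text(encoding="utf-8")))
        else:
            self.keys = set()

    def is_submitted(self, recording_id, fingerprint, length):
        return submission_key(recording_id, fingerprint, length) in self.keys

    def mark_submitted(self, recording_id, fingerprint, length):
        self.keys.add(submission_key(recording_id, fingerprint, length))
        # Rewrite the whole file; fine for a sketch, not tuned for huge collections.
        self.path.write_text(json.dumps(sorted(self.keys)), encoding="utf-8")
```

Because the key does not include the path, a file that was moved or renamed would still be recognized. A re-encoded copy would most likely get a different fingerprint and be treated as new, which is where extra codec information might come in.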