Possible file corruption of good files by Picard (macOS)


Hi all,

This is anecdotal; I’ve yet to look into it properly. I’m just putting it out there.

On a few occasions recently I’ve tagged some otherwise good MP3 files and the resultant tagged file has been corrupted - by which I mean on playback the file has artefacts (pops, etc.). I’ve tested both the original file and the resultant file on multiple players to confirm their quality.

My media is not local to my MacBook, so Picard loads from and saves to AFP shares over WiFi. I’ve only noticed this over WiFi; when LAN connected the files are ok. I’ve not tested SAMBA shares. If I load the original file and save again when LAN connected (WiFi disabled) the artefacts are no longer present.

My WiFi is patchy in certain parts of the house (as exhibited by LONG save times of large files). But Picard never complains or throws an error.

Although I seem to be blaming WiFi I’m not yet sure that’s the problem. I’m wondering if anyone else has experienced issues like this.

Thanks

You have answered your own question here, and as you suspected the problem is with the connection. Ideally you should work on the files locally and then send them via a wired connection, or better still transfer them via a portable drive.

1 Like

Technically, the problem can be mitigated by checking if the file written matches what was supposed to be written. If not, try rewriting once or twice before failing. :slight_smile:
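That write-then-verify-then-retry idea can be sketched in a few lines of Python. This is purely illustrative (Picard does not do this, per the thread); `copy_with_verify` and the file names are hypothetical:

```python
import hashlib
import os
import shutil
import tempfile

def copy_with_verify(src, dst, retries=2):
    """Copy src to dst, then re-read dst and compare SHA-256 digests.
    Retry a couple of times before giving up, as suggested in the thread."""
    def sha256_of(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 16), b""):
                h.update(chunk)
        return h.hexdigest()

    expected = sha256_of(src)
    for _ in range(retries + 1):
        shutil.copyfile(src, dst)
        if sha256_of(dst) == expected:  # re-read the destination to verify
            return True
    return False

# Demo against local temp files (stand-ins for the AFP share).
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "track.mp3")
with open(src, "wb") as f:
    f.write(b"fake mp3 bytes")
print(copy_with_verify(src, os.path.join(tmp, "copy.mp3")))  # True
```

The re-read is the expensive part: over a slow WiFi share it roughly doubles the transfer time, which is why this would make sense as an opt-in setting.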

Well, despite the unreliability of WiFi, the multiple layers on top of it (including its own error checking and retrying) should preclude on-disk errors. I can’t recall writing anything else out to the server that upon a re-load is found to be corrupted (documents, etc.).

It’s AFP over TCP, so even if the WiFi is dodgy TCP will clean that up - or carp, in which case it should pass the failure up the stack and notify AFP that the share has failed. That MAY be happening, with macOS re-establishing the share in the background. Again, that would affect all apps. Perhaps those apps (thinking Word, Excel) verify what’s written with a checksum (which means re-reading the entire file from the share, which takes time).

As to @gabrielcarvfer’s point - that’s what checksumming would allow for. Does Picard do this (re-read to verify)? I’m not sure it’s warranted but perhaps a check-box for the paranoid among us would be nice.

I might raise a ticket - although I’m loathe to create yet another login account for a MusicBrainz site. :slight_smile:

If it only happens on WiFi this might well be an issue with increased file size, and hence more load on the WiFi when loading those files. Are you embedding the cover images into tags? Especially if you use the images in original size from the Cover Art Archive they can be quite large. I would recommend either storing cover art only as separate files or limiting the image sizes. For Cover Art Archive you can set it to use only the 500px or 1200px sizes.

You’re quite right: there is packet inspection, packets should be getting checked, and if there is an error another is sent. But corrupt downloads can and do happen, so it comes down to the error correction rate along with packet size and transmission rate. Something else to consider is that Picard does not transcode or re-encode the audio and only adds the tags, so it should not affect the audio. If there is an issue during duplication and/or transfer of the file then it should not even open and play - it’s corrupted, and a corrupt file won’t play. I’m confused as to why you don’t get the issue over Ethernet but do over wireless, but surely there is a clue there, and then there’s the data rate / bitrate to your media player.

@outsidecontext could be onto something there with the artwork size adding to the total filesize which will then affect data rate / bitrate transmission.

2 Likes

Regarding AFP over TCP: TCP error detection/correction has limits, so depending on your SNR, things can get pretty bad even with wired connections.

I was thinking about checksums, but the problem is that if you use the downloaded file to calculate the checksum, the file could already be corrupted and checksumming it would just forward the error.

@carV3 is completely correct on the network side.

Messing up a few bits/bytes shouldn’t prevent the file from playing (unless the header is corrupted, which is improbable given how small it is relative to the data). It can definitely affect the audio, especially in compressed encodings. PCM/raw audio corruption should be less noticeable.
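That point can be demonstrated with a toy example: flip one payload bit in a fake "file" and the header (and hence playability) survives, while the audio data no longer matches its checksum. The header bytes and sizes here are made up for illustration:

```python
import zlib

header = b"ID3\x03\x00"            # stand-in for an MP3's ID3v2 header
payload = bytes(range(256)) * 4    # stand-in for the audio frames
original = header + payload

corrupted = bytearray(original)
corrupted[len(header) + 100] ^= 0x01   # flip a single bit in the payload

# Header intact, so a player would still open the file...
assert corrupted[:len(header)] == header
# ...but the data has changed, so any checksum over it now differs.
assert zlib.crc32(original) != zlib.crc32(bytes(corrupted))
print("header intact, payload checksum changed")
```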

Edit: looks like, at least on SMB, an SHA checksum can be used to guarantee integrity of the files if the client is configured to require the signing of exchanged messages (HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanWorkstation\Parameters->RequireSecuritySignature=1). https://blogs.technet.microsoft.com/josebda/2010/12/01/the-basics-of-smb-signing-covering-both-smb1-and-smb2/

For samba (SMB for Linux/Mac), the equivalent setting in smb.conf seems to be client signing = mandatory . https://www.samba.org/samba/docs/current/man-html/smb.conf.5.html
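For reference, that option goes in the `[global]` section of smb.conf (fragment only; option name taken from the samba docs linked above):

```
[global]
    client signing = mandatory
```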

NFS also has a similar setting, but requires both server and client to be properly configured with sec=krb5i option. https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/storage_administration_guide/s3-nfs-security-hosts-nfsv4

Couldn’t find anything similar for AFP, and since Apple deprecated it, I don’t think they will fix that.

2 Likes

A way to verify if the audio is correct as per a saved log file (I use XLD) is:

for f in *.flac ; do grep -q "$(ffmpeg -loglevel error -nostdin -i "$f" -vn -acodec pcm_s16le -f s16le - | rhash -p '%C' -C -)" *.log || echo "Failed CRC: $f" ; done

You need ffmpeg and rhash (or some other way to calculate a CRC32).

1 Like

MD5/SHA are probably better suited for that. CRCs with such a small polynomial are better suited to small payloads (that’s exactly why they’re usually used to make sure at least file/packet headers are fine).
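The size difference is easy to see: both are one-liners in Python’s standard library (the payload here is just a stand-in):

```python
import hashlib
import zlib

data = b"decoded PCM audio"            # stand-in payload
crc = zlib.crc32(data) & 0xFFFFFFFF    # 32-bit value, as found in XLD logs
sha = hashlib.sha256(data).hexdigest()

# CRC32 gives 8 hex digits; SHA-256 gives 64 and is far more collision-resistant.
print(f"CRC32 is 8 hex digits, SHA-256 is {len(sha)}")
```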

2 Likes

I agree entirely, but XLD only saves CRC32 hashes in its log file, so that’s all one can compare against. I just ran that script (slightly modified) against 6,500 ripped tracks and found around 10 that I believe were corrupted by Picard. I re-ripped and re-saved with Picard and they were OK the second time around.