Hey all, I'm a newbie here, and have been playing around with Picard (32-bit, on a Win10 x64 8-core machine) for about a month, slowly working my way through a library of over 100,000 files. (From reading other forum posts, I know this is going to take months, if not years, to complete, as I'm manually importing small groups of files at a time and manually modifying/correcting the metadata in the process, and have only knocked over about 1,000 files so far.)
Anyhow, I'm just curious: given the abundance of machines running an x64 OS since Windows 7 (thinking primarily of OEM installs) with multi-core processors, when will a true 64-bit version of Picard be made available? I've noted on my AMD 8-core machine that Picard only uses one core (roughly 14% total CPU), which limits its processing speed and in turn slows down importing, modifying/correcting and saving the updated files (the primary slowdowns being importing, clustering, lookups and scanning), especially when working with large volumes of files.
I can't say right now whether somebody has already looked into creating a 64-bit package of Picard for Windows, but my guess is that it would not bring much of a performance improvement.
For Picard the bottleneck is not the CPU. The slow parts are the network requests and writing the files to disk when saving.
The network requests to MusicBrainz are rate limited, and the number of requests per album also depends on the plugins and options you’ve enabled.
Saving the files can be rather fast when Picard only needs to rewrite the tag parts, but it can be slow when Picard needs to rewrite the entire file (especially if it is a large file like FLAC) and/or the files are stored on network storage.
Maybe if you can describe which steps are especially slow for you we can come up with improvements.
This really depends on what you mean by processing. The three primary operations are Scan, Lookup and Save, and each has a different resource bottleneck.
Scan (audio analysis) is probably mainly CPU bound (and maybe also disk-read bound on slower drives). But Scan is something you usually won't run on too many tracks unless your existing tags are all totally crap and useless.
Lookup is mostly network bound, which means it is heavily limited by the MB request limit.
Saving is heavy on disk I/O, and of course on the network if you save to network drives. The deciding factor for I/O here is whether just the file tags need to be updated or the entire file needs to be rewritten. That depends on how much the tags grow and how much padding was left free in the files.
In the end, loading 100k songs at once is not a great idea for various reasons, but there are already plenty of threads about that.
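The tag-growth-vs-padding point above can be sketched as a tiny decision function. This is a simplified model for illustration, not Picard's actual save logic; the function name and parameters are hypothetical:

```python
def needs_full_rewrite(new_tag_size, old_tag_size, padding):
    """Simplified model of the padding decision described above.

    A file can be updated in place when the grown tag block still fits
    into the old tag area plus the padding left free in the file;
    otherwise the entire file has to be rewritten, which is slow for
    large files (FLAC) or network storage.
    """
    return new_tag_size > old_tag_size + padding
```

So a save is cheap as long as the tags only grow within the padding, and expensive the moment they outgrow it.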
If this is true, shouldn't Picard's access to a local MB VM (including DB and search) running on a local SSD be much faster? Comparing the same Picard version accessing the official MB server on the internet doesn't show any noticeable speed difference.
Could it be that the local MB-VM is also rate limited?
The rate limiting is a constraint on accessing the info from MusicBrainz, which can only be overcome if you have your own copy of the database.
However, I agree that it is a pain! Particularly if you want all the data for classical releases: my Classical Extras plugin gets severely slowed down because each "work relations" lookup needs to be submitted separately. Ideally, MusicBrainz would allow multiple relations in one request, but I suspect that would be a technical challenge. Another thought is to allow a more dynamic form of rate limiting, e.g. 100 ms per request for the first x requests in an hour, then 1 s thereafter.
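The dynamic scheme suggested above (fast for a burst, slower afterwards) could look roughly like this. This is only a sketch of the idea, not anything MusicBrainz implements; the class name and all numbers are hypothetical:

```python
import time
from collections import deque


class TieredRateLimiter:
    """Sketch of tiered rate limiting: a short delay for the first
    `burst` requests within a sliding window, a longer delay after
    that. Delays are in seconds, the window in seconds as well."""

    def __init__(self, burst=100, fast_delay=0.1, slow_delay=1.0, window=3600):
        self.burst = burst
        self.fast_delay = fast_delay
        self.slow_delay = slow_delay
        self.window = window
        self.timestamps = deque()  # request times inside the window

    def next_delay(self, now=None):
        """Record a request and return the delay to apply before the next one."""
        now = time.monotonic() if now is None else now
        # Forget requests that have fallen out of the sliding window.
        while self.timestamps and now - self.timestamps[0] > self.window:
            self.timestamps.popleft()
        self.timestamps.append(now)
        # Within the burst allowance requests stay fast, beyond it they slow down.
        return self.fast_delay if len(self.timestamps) <= self.burst else self.slow_delay
```

A server-side version of this would let well-behaved clients burst briefly while still capping sustained load.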
# Throttles requests to a given hostkey by assigning a minimum delay between
# requests in milliseconds.
#
# Plugins may assign limits to their associated service(s) like so:
#
# >>> from picard.webservice import REQUEST_DELAY_MINIMUM
# >>> REQUEST_DELAY_MINIMUM[('myservice.org', 80)] = 100 # 10 requests/second
REQUEST_DELAY_MINIMUM = defaultdict(lambda: 1000)
# Current delay (adaptive) between requests to a given hostkey.
REQUEST_DELAY = defaultdict(lambda: 1000) # Conservative initial value.
Which syntax do I have to use? The example and the effective setting don't match.
How can I decrease REQUEST_DELAY and REQUEST_DELAY_MINIMUM for a specific local host at 192.168.1.234 as far as possible? Or can I even eliminate these delays locally?
# Plugins may assign limits to their associated service(s) like so:
#
# >>> from picard.webservice import ratecontrol
# >>> ratecontrol.set_minimum_delay(('myservice.org', 80), 100) # 10 requests/second
# Minimum delay for the given hostkey (in milliseconds), can be set using
# set_minimum_delay()
REQUEST_DELAY_MINIMUM = defaultdict(lambda: 1000)
# Current delay (adaptive) between requests to a given hostkey.
REQUEST_DELAY = defaultdict(lambda: 1000) # Conservative initial value.
# Determines delay during exponential backoff phase.
REQUEST_DELAY_EXPONENT = defaultdict(lambda: 0)
Yes, true. That's also one reason I never posted a ready-to-use example so far: I was sure people would abuse it to get faster access to MB.org and shoot themselves in the foot when doing so.
But my answer was specifically for @InvisibleMan78, who seems to have a local VM and asked about that. I think I will ignore my fears and post an example later.
There appears to be a bug when saving loads of files at once. If I press Ctrl+A to select all tracks and hit Save, it's incredibly slow.
If I press Ctrl+A to select all tracks, click Save, then click in the white space to deselect all tracks, it's incredibly quick to save. It's as if there is some GUI penalty when all the tracks are selected on save.