Cover Art: How to reduce PNG file size in GIMP?

You mean 1200 dpi + PNG? That file is going to be ridiculously large, on the order of 200 MB. And again, if you really insist on spending that much space, 2400 dpi (with a recent scanner) + JPEG would still be the better option. I hope you got my point.
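
For a sense of scale, here is a rough back-of-the-envelope sketch (my own numbers, assuming 24-bit RGB and two common cover sizes):

```python
# Rough uncompressed sizes for 1200 dpi RGB scans (3 bytes per pixel).
# PNG is lossless, so a noisy scan compresses only modestly below these.
for name, inches in [("CD booklet (12 cm)", 12 / 2.54),
                     ("vinyl sleeve (12 in)", 12)]:
    px = round(inches * 1200)   # pixels per side
    mib = px * px * 3 / 2**20   # raw RGB size in MiB
    print(f"{name}: {px} x {px} px, ~{mib:,.0f} MiB raw")
# CD booklet (12 cm): 5669 x 5669 px, ~92 MiB raw
# vinyl sleeve (12 in): 14400 x 14400 px, ~593 MiB raw
```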

2 Likes

I get your point, and since space is not an issue but human time is, I would still really appreciate having the largest size a user scans at uploaded as-is, before it is compressed to a lossy format or downsized to meet the user’s hard-drive needs.

I found that with my own scanner, scanning at 1200 dpi didn’t produce any more detail than scanning at 600 dpi, but it did take ages. So it might be more of a placebo thing, depending on your scanner.

1 Like

That’s because for the vast majority of the cover scans, with 600 dpi you are already well beyond the resolution of the source (which is good). Thus you are already getting “everything” at 600 dpi.

That’s also the reason why high-quality JPEG compression doesn’t do any harm. With high enough resolution there are no sharp edges or contrasts at the pixel level; the image is “smooth”. The resulting JPEG artifacts are on the same level as the natural noise of the scanner’s analog-to-digital conversion.
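
For illustration, a minimal sketch of such a conversion with Pillow (filenames are placeholders; quality 90–95 is roughly the “hq” range meant here):

```python
from PIL import Image

# Convert a lossless scan to high-quality JPEG. At this quality the
# compression artifacts sit at the level of the scanner's own noise.
scan = Image.open("cover_600dpi.png").convert("RGB")  # JPEG has no alpha channel
scan.save("cover_600dpi.jpg", quality=92, optimize=True,
          subsampling=0)  # 4:4:4 chroma keeps fine colour detail intact
```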

2 Likes

IMO, it is.
Why have tons of useless megabytes in datacentres, consuming gigawatts and producing nuclear waste?
As long as we can read the credits and recognise the edition, the size can stay as small as a 200 dpi JPEG.

2 Likes

It’s a cover art archive. I want to archive cover art because the art is important to me, and I want to save it for the future, with all its details.
You only care about the art as a means to access metadata, which is fine, but that’s not the only point of the archive, and not why I upload images.

1 Like

I agree with aerozol that the aim should be to archive the cover art, which means that the resolution should be high enough to capture all the details, not only the mere textual information.

And I agree with jesus2099 that space is an issue, even though there is no apparent hard limit. As with all resources, a diligent and respectful approach should be taken.

Accepting these conditions, high resolution + high-quality JPEG compression is the way to go. JPEG artifacts are not an issue when the resolution is high enough! The PNG format certainly has a right to exist, but in these circumstances it’s just a waste of resources.

3 Likes

To bring some nuance to the JPEG vs PNG consideration, in defence of PNG: I believe PNG could be justified (only) when the source image contains large areas of very subtle gradients (e.g. skies).
With JPEG compression set to ‘high’, image details will not be an issue at all, but with a keen photographic eye you might be able to spot some slight banding in the gradient parts of an image.
But the chances that you will run into such subtle gradients in cover art are extremely slim, and even then, probably nobody would ever notice.
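
A quick way to test this, assuming Pillow is installed: generate a smooth grey ramp, save it both ways, then inspect the darkest region.

```python
from PIL import Image

# A smooth horizontal grey ramp: the worst case for JPEG banding.
ramp = Image.new("RGB", (1024, 256))
ramp.putdata([(x // 4,) * 3 for y in range(256) for x in range(1024)])
ramp.save("ramp.png")              # exact, lossless
ramp.save("ramp.jpg", quality=90)  # may show faint banding steps
```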

I’ve really struggled to find information on how much energy (or nuclear waste?) is used by something like 100 MB readily accessible in the ‘cloud’.
If anyone has any links I’d be grateful.

As someone who used to lay out, spec, and maintain petabytes of parallel filesystems and the underlying storage, 100 MB is smaller than a pinhead. I started writing a dissertation on what goes into powering and cooling a computing complex but deleted it all. I am not that worried about the environmental costs, but they do add up. The inefficient wall-warts that power the networks in many homes probably draw more power.

I have the slight suspicion that you are assuming that CAA contains only a single file.

No, as I said, it all adds up. There are calculations that go into putting in new storage; just a few are floor space, power, cooling, and RAID level. Your comment made me think your worry was “green” in nature, not wasted money on equipment. Environmentally, you have to take the usable storage size of the configured unit, divide it by the 100 MB, and then divide that number into the power and cooling budget of the unit to give you the cost of that 100 MB. It’s small, but it adds up. Power is relatively easy to install (and consume); cooling all that wattage is another story, and that does take energy. I am not trying to minimize the power/cooling consumption issue; it is real. But we are way off topic, so I will end.
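
To make the arithmetic concrete, a toy version with entirely made-up numbers for a hypothetical storage unit:

```python
unit_storage_tb = 500         # usable storage of the unit (made up)
unit_power_cooling_w = 2_000  # power + cooling budget in watts (made up)

slices = unit_storage_tb * 1e6 / 100            # how many 100 MB files fit
watts_per_file = unit_power_cooling_w / slices  # budget share of one file
print(f"~{watts_per_file * 1000:.1f} mW per 100 MB")  # ~0.4 mW: tiny, but it adds up
```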

I believe the standard resolution for commercial printed documents is 300 dpi.
This quality is at least what magazines use, so I would expect cover art to use the same printers.

When scanning, you want to scan at a higher resolution than the source because of the dots in the document: if you use the same resolution, the scanner’s pixels will not always align with the printed dots, so you will capture parts of dots and miss detail.

If you scan at 1200 dpi you are not adding much more quality; you will capture the same printed dot several times in adjacent pixels. This gives you more information when downsampling, since the algorithm that averages the surrounding pixels has more data to play with, but you may also be introducing more noise.
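
For illustration, the downsampling step might look like this with Pillow’s Lanczos filter (filenames and the 2× factor are placeholders):

```python
from PIL import Image

# Scan high, then resample down so each output pixel is computed
# from several input pixels instead of a single misaligned one.
scan = Image.open("cover_1200dpi.png")
w, h = scan.size
half = scan.resize((w // 2, h // 2), Image.LANCZOS)  # 1200 dpi -> 600 dpi
half.save("cover_600dpi.png")
```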

Speaking of file size, could we ask archive.org to provide an API to convert these images?
Personally, I would send them a big PNG (or another lossless format) so they have an archival-quality version. They could then take this image and generate smaller JPEGs or PNGs in different sizes for you to write to your music files.

2 Likes

Isn’t that what the Cover Art Archive already does?
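
As far as I know, it already serves pre-generated 250, 500, and 1200 px thumbnails next to the original upload. A minimal fetch sketch (the release MBID below is a placeholder):

```python
import urllib.request

# Fetch the 500 px front-cover thumbnail the archive derives from the
# original upload; urlopen follows the redirect to the image file.
mbid = "00000000-0000-0000-0000-000000000000"  # placeholder release MBID
url = f"https://coverartarchive.org/release/{mbid}/front-500"
with urllib.request.urlopen(url) as resp:
    with open("front-500.jpg", "wb") as out:
        out.write(resp.read())
```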

4 Likes

Since nobody’s come back with any information at all on the power use or waste creation of cloud-storage files, I’m going to go with what I could find online, which indicated that the power use of an average server is not nothing, but also nothing that will keep me up at night.

If you’re interested in archiving images for others to access online, I will stick with my original request: images at 600 dpi or higher, preferably in a lossless format. This is what the heritage team at work considers archival quality, and what I find useful when I need images for graphic work.

2 Likes

http://www.ethicalconsumer.org/ethicalreports/internetreport/ecofootprint.aspx

Most of the energy used (wasted) in data centres goes to air conditioning. The Internet Archive uses a storage system that runs without air conditioning at 3 kW per petabyte. According to the same page, the Internet Archive uses 50 petabytes of storage, which puts the total power draw at 150 kW.

One wind turbine rated at 1.5 MW (apparently that is the bottom end of the market nowadays) and an average capacity factor of 26.9% in the US (where the Archive is located) gives us an average production of 403.5 kW. So the Internet Archive can grow quite a bit before it needs more power than one wind turbine can deliver.
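
Spelling that arithmetic out with the numbers quoted above:

```python
storage_pb = 50          # Internet Archive storage, petabytes
kw_per_pb = 3            # quoted draw per petabyte, no air conditioning
archive_kw = storage_pb * kw_per_pb  # 150 kW total

turbine_mw = 1.5         # small modern turbine, nameplate rating
capacity_factor = 0.269  # average US capacity factor
turbine_kw = turbine_mw * 1000 * capacity_factor  # 403.5 kW average output

print(archive_kw, turbine_kw)  # 150 vs 403.5: one turbine covers it
```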

And at any rate, I’d rather spend that electricity for archiving our heritage than for liking something on Facebook.

4 Likes

Thank you @mfmeulenbelt. What I found was also that, to use one example, it would take four cyclists to power one modern server (used as an example of inefficiency in an article). Not particularly compelling.

The articles that @jesus2099 links have no power consumption specifics. And from one of them:
"While most media and public attention focuses on the largest data centers that power so-called cloud computing operations – companies that provide web-based and Internet services to consumers and businesses – these hyper-scale cloud computing data centers represent only a small fraction of data center energy consumption in the United States."
Since I don’t trust sensationalist news very much I don’t particularly believe that either, but there you go.

I’m totally open to having my mind changed if anyone can give me something concrete, until then I recommend making your next meal vegan or walking/biking to work as a much more effective way of saving the environment.

I think we should not limit this to the USA.
I don’t know where the datacentres actually are.
What I know is that online archives just keep growing; they are not something that gets disposed of.
For video and audio streaming it’s most probably worse.
It all keeps growing and has to stay powered all the time.

It’s mostly what I do.

1 Like

Unfortunately, archiving masses of physical media also requires ongoing temperature, lighting, and humidity control. Many archival institutes are moving to digital because of savings in these areas (though space and searchability improvements are probably the biggest factors). If the power of four cyclists can ‘replace’ a whole library’s worth of physical media, I don’t think we are doing badly on the pollution front. And it has been getting more efficient as time passes.

Streaming must cause a massive data load that I haven’t looked into. I would love to see a paper or study that has more comparisons and details if anyone knows of one.

:pray: :heart:

3 Likes