Python: Optimizing Images in Memory (StringIO & Popen with jpegoptim)

I'm trying to compress images without touching disk, using the STDIN mode of various tools (jpegoptim in this example).

This code does not return an optimized (jpegoptim-compressed) image.

Can someone please help or explain why this usage of Popen() with a StringIO.StringIO() object does not return the optimized version of the image? If I save the file to disk, it works just fine.

import sys
import urllib2 as urllib
import StringIO

from subprocess import Popen, PIPE, STDOUT
fp = urllib.urlopen('http://www.path.to/unoptimized.jpg')
out_im2 = StringIO.StringIO(fp.read()) # StringIO Image
print "Image Size: %s" % format(sys.getsizeof(out_im2.getvalue()))
subp = Popen(["/usr/bin/jpegoptim", "-"], shell=True, stdout=PIPE, stdin=PIPE, stderr=STDOUT)
image_str = subp.communicate(input=out_im2.getvalue())[0]
out_im2.write(image_str)

##This should be a different size if it worked! It's not
print "Compressed JPG: %s" % format(sys.getsizeof(out_im2.getvalue()))

It is because you are writing to the same input buffer. Create a new StringIO().

The StringIO buffer initially holds the whole uncompressed JPEG. You then write the new, shorter string over it starting at position 0, but the buffer is not truncated automatically. It keeps its original size, and everything past the end of the new data is leftover junk from the original image.

In [1]: import StringIO

In [2]: out = StringIO.StringIO("abcdefg")

In [3]: out.getvalue()
Out[3]: 'abcdefg'

In [4]: out.write("123")

In [5]: out.getvalue()
Out[5]: '123defg'
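
As a minimal sketch of the fix, here is the same behaviour and two ways around it. It uses Python 3's io.BytesIO rather than the Python 2 StringIO module from the question (that substitution is an assumption of this example), and the short byte strings only stand in for image data.

```python
import io

original = b"abcdefg"   # stands in for the uncompressed image
smaller = b"123"        # stands in for the optimized output

# Problem: writing into the input buffer overwrites from position 0
# and leaves the tail of the old data in place.
reused = io.BytesIO(original)
reused.write(smaller)
overwritten = reused.getvalue()

# Fix 1: put the result in a fresh buffer.
fresh = io.BytesIO(smaller)
fresh_value = fresh.getvalue()

# Fix 2: keep the same buffer, but rewind and truncate before writing.
reused.seek(0)
reused.truncate()
reused.write(smaller)
truncated_value = reused.getvalue()

print(overwritten, fresh_value, truncated_value)
```

If nothing else needs a file-like object, the bytestring returned by communicate() can also be used directly without wrapping it in a buffer at all.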

There are several issues:

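  1. The incorrect overwriting of the StringIO() buffer, pointed out by @doog abides.
  2. Use len() instead of sys.getsizeof(). The latter returns the size of the object's internal representation in memory, which is not equal to the number of bytes in the bytestring.
  3. Don't use a list argument and shell=True together. On POSIX, only the first list item is run as the command and the remaining items become arguments to the shell itself, so jpegoptim never receives "-".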
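
To illustrate the second point, a small sketch comparing the two measurements. The five-byte string is only a placeholder for image data, and the exact sys.getsizeof() result depends on the Python implementation and version, so it is printed rather than stated.

```python
import sys

data = b"abcde"  # placeholder for image bytes

byte_count = len(data)             # number of bytes in the payload
object_size = sys.getsizeof(data)  # payload plus per-object overhead

print(byte_count, object_size)
```

Because the overhead is added to both the original and the compressed string, sys.getsizeof() still shrinks when the data shrinks, but it is not the figure to report as a file size.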

You can pass the socket as stdin to the subprocess on some systems:

import socket
from urllib2 import urlopen
from subprocess import check_output

saved = socket._fileobject.default_bufsize
socket._fileobject.default_bufsize = 0  # hack to disable buffering
try:
    fp = urlopen('http://www.path.to/unoptimized.jpg')
finally:
    socket._fileobject.default_bufsize = saved # restore back

# urlopen() has read http headers; subprocess can read the body now
image_bytes = check_output(["/usr/bin/jpegoptim", "-"], stdin=fp) 
fp.close()

# use `image_bytes` bytestring here..

stderr is deliberately not redirected, so that jpegoptim's error messages are not hidden.
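
Where passing the socket directly is not possible, the bytes can be read into memory first and piped through the child process. The following is a rough Python 3 sketch of that pattern, not the code from the answer above: run_filter and STAND_IN are hypothetical names, and the stand-in child is a tiny Python script that drops every other byte so the example runs without jpegoptim installed. With jpegoptim available, the command list from the question would take its place.

```python
import subprocess
import sys

def run_filter(cmd, data):
    """Pipe `data` (bytes) through `cmd` and return what it writes to stdout."""
    # `cmd` is a list and shell=True is not used. stderr is left alone,
    # so the child's error messages still reach the terminal, and a
    # non-zero exit status raises CalledProcessError.
    return subprocess.check_output(cmd, input=data)

# Hypothetical stand-in for an optimizer: keeps every other byte of stdin.
STAND_IN = [
    sys.executable, "-c",
    "import sys; d = sys.stdin.buffer.read(); sys.stdout.buffer.write(d[::2])",
]

original = b"0123456789"
result = run_filter(STAND_IN, original)

# Compare payload sizes with len(), not sys.getsizeof().
print(len(original), len(result))
```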
