
gzip with file descriptor

I need to be able to open a gzipped CSV file using a file descriptor. When I do the following (this is simplified code that still triggers the error):

import csv
import gzip
import io
import os

file_path = r"E39O6KS6J8MIZW.00.0f6db4e5.gz"
fd = os.open(file_path, os.O_RDONLY, os.O_BINARY)
# ...
file_object = gzip.GzipFile(fileobj=io.FileIO(fd, mode='rb'))
reader = csv.reader(io.BufferedReader(file_object))
for row in reader:
    print row

I get a CRC check error:

Traceback (most recent call last):
  File "./log_processing_scripts/dev.py", line 40, in <module>
    post()
  File "./log_processing_scripts/dev.py", line 36, in post
    for row in reader:
  File "C:\Python27\lib\gzip.py", line 252, in read
    self._read(readsize)
  File "C:\Python27\lib\gzip.py", line 299, in _read
    self._read_eof()
  File "C:\Python27\lib\gzip.py", line 338, in _read_eof
    hex(self.crc)))
IOError: CRC check failed 0x4b77635f != 0xbe13716L

The file is not(!) corrupted; I can process it just fine with gzip.open(file_path).

What am I missing?

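os.open takes all of its flags in a single argument, so O_RDONLY and O_BINARY must be OR'd together. In your call, os.O_BINARY is passed as the third argument, which is the permission mode, so it has no effect on how the file is opened. On Windows the descriptor is then opened in text mode, which alters the compressed bytes as they are read (line-ending translation, Ctrl-Z treated as end of file) and makes the CRC check fail. Combine the flags like so: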

fd = os.open(file_path, os.O_RDONLY | os.O_BINARY)

Reference: https://docs.python.org/2/library/os.html#open-constants
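
For completeness, here is a minimal sketch of the whole read path with the flags combined. It is written for Python 3 rather than the Python 2.7 used in the question, so the CSV reader is fed decoded text. The function name read_gzip_csv_via_fd and the UTF-8 encoding are assumptions made for illustration, not something taken from the question.

```python
import csv
import gzip
import io
import os


def read_gzip_csv_via_fd(file_path):
    """Return all rows of a gzipped CSV file opened through a file descriptor.

    Hypothetical helper for illustration; assumes the CSV is UTF-8 encoded.
    """
    # O_BINARY exists only on Windows, so fall back to 0 on other platforms.
    flags = os.O_RDONLY | getattr(os, "O_BINARY", 0)
    fd = os.open(file_path, flags)
    # os.fdopen wraps the descriptor and closes it when the with-block exits.
    with os.fdopen(fd, "rb") as raw:
        with gzip.GzipFile(fileobj=raw, mode="rb") as gz:
            # csv.reader expects text in Python 3, so decode the byte stream.
            with io.TextIOWrapper(gz, encoding="utf-8", newline="") as text:
                return list(csv.reader(text))
```

The getattr fallback keeps the same call working on platforms where os.O_BINARY is not defined. Returning a list is only reasonable for small files; for large logs it would be better to iterate over the reader while the file is still open.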
