I would like to have such call:
pv -ptebar compressed.csv.gz | python my_script.py
Inside my_script.py
I would like to decompress compressed.csv.gz
and parse it using Python csv parser. I would expect something like this:
import csv
import gzip
import sys
with gzip.open(fileobj=sys.stdin, mode='rt') as f:
reader = csv.reader(f)
print(next(reader))
print(next(reader))
print(next(reader))
Of course it doesn't work because gzip.open
doesn't have fileobj
argument. Could you provide some working example solving this issue?
UPDATE
Traceback (most recent call last):
File "my_script.py", line 8, in <module>
print(next(reader))
File "/usr/lib/python3.5/gzip.py", line 287, in read1
return self._buffer.read1(size)
File "/usr/lib/python3.5/_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "/usr/lib/python3.5/gzip.py", line 461, in read
if not self._read_gzip_header():
File "/usr/lib/python3.5/gzip.py", line 404, in _read_gzip_header
magic = self._fp.read(2)
File "/usr/lib/python3.5/gzip.py", line 91, in read
self.file.read(size-self._length+read)
File "/usr/lib/python3.5/codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
The traceback above appeared after applying @Rawing advice.
In python 3.3+, you can pass a file object to gzip.open
:
The filename argument can be an actual filename (a str or bytes object), or an existing file object to read from or write to.
So your code should work if you just omit the fileobj=
:
with gzip.open(sys.stdin, mode='rt') as f:
Or, a slightly more efficient solution:
with gzip.open(sys.stdin.buffer, mode='rb') as f:
If for some odd reason you're using a python older than 3.3, you can directly invoke the gzip.GzipFile
constructor . However, these old versions of the gzip
module didn't have support for files opened in text mode, so we'll use sys.stdin
's underlying buffer instead:
with gzip.GzipFile(fileobj=sys.stdin.buffer) as f:
使用gzip.open(sys.stdin.buffer, 'rt')
修复了Python 3的问题。
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.