
How to open a json.gz.part file using Python?

I have lots of json.gz files in a directory, and some of them are json.gz.part. Supposedly, when they were saved, some of the files were too large and were split.

I tried to open them as usual with:

import gzip
import json

with gzip.open(file, 'r') as fin:
    json_bytes = fin.read()                  # 1. bytes
json_str = json_bytes.decode('utf-8')        # 2. string (i.e. JSON)
bb = json.loads(json_str)                    # 3. Python object

But when it comes to the .gz.part files I get an error:

uncompress = self._decompressor.decompress(buf, size)

error: Error -3 while decompressing data: invalid code lengths set

I've tried jiffyclub's solution, but I get the following error:

    _read_eof = gzip.GzipFile._read_eof

AttributeError: type object 'GzipFile' has no attribute '_read_eof'

EDIT:

If I read line by line, I'm able to read most of the file's content before I get an error:

with gzip.open(file2, 'r') as fin:
    for line in fin:
        print(line.decode('utf-8'))

After printing most of the content I get:

error: Error -3 while decompressing data: invalid code lengths set

But with this last method I cannot convert the content to JSON.
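One way to turn the lines that do decompress into Python objects is sketched below. It assumes the file is line-delimited JSON (one complete JSON document per line), which the question does not confirm; `load_json_lines_until_error` is a hypothetical helper name. It keeps every line parsed before the stream breaks and stops at the first decompression error instead of propagating it.

```python
import gzip
import json
import zlib


def load_json_lines_until_error(path):
    """Parse one JSON document per line, stopping when the gzip stream breaks.

    Assumption: the file is line-delimited JSON. Returns the records parsed
    before the first decompression error (or all of them if there is none).
    """
    records = []
    try:
        with gzip.open(path, 'rt', encoding='utf-8') as fin:
            for line in fin:
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                try:
                    records.append(json.loads(line))
                except json.JSONDecodeError:
                    continue  # skip a line that is not valid JSON
    except (EOFError, zlib.error, OSError):
        # EOFError: stream ended early; zlib.error: corrupt compressed data;
        # OSError: other gzip-level problems such as a bad header.
        pass
    return records
```

If the file holds a single large JSON document rather than one document per line, this will not help, because a truncated document cannot be parsed as a whole.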

import gzip
import shutil

# open the .gz file
with gzip.open('file.gz.part', 'rb') as f_in:
    # open the decompressed file
    with open('file.part', 'wb') as f_out:
        # decompress the .gz file and write the decompressed data to the decompressed file
        shutil.copyfileobj(f_in, f_out)

# now you can open the decompressed file as text
with open('file.part', 'r', encoding='utf-8') as f:
    # do something with the file
    contents = f.read()

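This code opens the .gz.part file, decompresses the data, and writes it to a new file called file.part. You can then open file.part and read its contents like any other text file. Note that this only works cleanly if the .gz.part file is a complete gzip stream: if it is truncated or corrupt, gzip.open still raises an error partway through the copy, and file.part will contain only the data decompressed up to that point.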
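If the .gz.part file is truncated or damaged, a more tolerant approach is to feed the raw bytes to zlib in small chunks and keep whatever decompresses before the first error. The following is a minimal sketch of that idea, not a guaranteed fix. `salvage_gzip` and its `chunk_size` parameter are hypothetical names, and the sketch assumes the file begins with a valid gzip header (that is, it is the first piece or a truncated copy, not a middle piece of a split archive).

```python
import zlib


def salvage_gzip(path, chunk_size=4096):
    """Return the bytes that can be decompressed before the stream breaks.

    Assumption: the file starts with a valid gzip header and holds a single
    gzip member. Output from the chunk that triggers an error is lost, so a
    small chunk_size recovers slightly more data.
    """
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
    recovered = bytearray()
    with open(path, 'rb') as fin:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break  # end of file (possibly before the end of the stream)
            try:
                recovered += decompressor.decompress(chunk)
            except zlib.error:
                break  # corrupt data: keep what was recovered so far
            if decompressor.eof:
                break  # reached the end of the gzip member
    return bytes(recovered)
```

The recovered bytes may stop in the middle of a JSON value, so calling json.loads on the whole result is likely to fail unless the data is line-delimited and the last incomplete line is dropped first.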
