Unzipping multiple .gz files into single text file using Python

I am trying to unzip multiple .gz files into a single .txt file. All of these files contain JSON data.

I tried the following code:

from glob import glob
import gzip
import shutil

for fname in glob('.../2020-04/*gz'):
    with gzip.open(fname, 'rb') as f_in:
        with open('.../datafiles/202004_twitter/decompressed.txt', 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)

But decompressed.txt only contains the data from the last .gz file.

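Move f_out outside the loop: open it once, before iterating over the input files, and keep that one handle open. Opening it in 'wb' mode inside the loop truncates the file on every iteration, which is why only the last file's data survives.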

from glob import glob
import gzip
import shutil

# Open the output once so it is truncated only once, then append each input.
with open('.../datafiles/202004_twitter/decompressed.txt', 'wb') as f_out:
    for fname in glob('.../2020-04/*gz'):
        with gzip.open(fname, 'rb') as f_in:
            shutil.copyfileobj(f_in, f_out)
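
For anyone who wants to try this without the original data, here is a minimal, self-contained sketch of the same idea wrapped in a function. The name concat_gz and its parameters are made up for illustration, and sorting the glob results is an added assumption (glob returns files in arbitrary order, and the question does not say whether order matters).

```python
import gzip
import shutil
from glob import glob


def concat_gz(pattern, out_path):
    """Decompress every file matching `pattern` into a single output file.

    Hypothetical helper, not part of the original answer.
    """
    # sorted() makes the concatenation order deterministic.
    paths = sorted(glob(pattern))
    # Open the output once, before the loop, so 'wb' truncates it only once.
    with open(out_path, 'wb') as f_out:
        for path in paths:
            with gzip.open(path, 'rb') as f_in:
                # Streams in chunks, so large files are not loaded into memory.
                shutil.copyfileobj(f_in, f_out)
    return paths
```

It would be called with the same kind of arguments as in the question, for example concat_gz('.../2020-04/*gz', '.../datafiles/202004_twitter/decompressed.txt').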

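Use "ab" mode instead. "a" opens the file in append mode, whereas "w" erases the file every time it is opened. Note that append mode also keeps whatever a previous run wrote, so delete the output file before rerunning.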
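
A rough sketch of the append-mode variant, keeping the question's loop structure. It assumes binary append mode 'ab', since Python's open() accepts only one of r, w, a, or x in a mode string. The function name append_gz is hypothetical, and removing a leftover output file first is an added precaution, not something the answer specifies.

```python
import gzip
import os
import shutil
from glob import glob


def append_gz(pattern, out_path):
    """Append the decompressed contents of each matching file to out_path.

    Hypothetical helper for illustration only.
    """
    # Append mode never truncates, so clear any output left by an earlier run;
    # otherwise the same data would be added a second time.
    if os.path.exists(out_path):
        os.remove(out_path)
    for path in sorted(glob(pattern)):
        with gzip.open(path, 'rb') as f_in:
            # 'ab' = append + binary: each open() continues at the end of the file.
            with open(out_path, 'ab') as f_out:
                shutil.copyfileobj(f_in, f_out)
```

Compared with opening the output once outside the loop, this reopens the file for every input, so it is mainly useful when the loop structure cannot easily be changed.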
