
Unzipping multiple .gz files into a single text file using Python

I am trying to unzip multiple .gz files into a single .txt file. All of these files contain JSON data.

I tried the following code:

from glob import glob
import gzip
import shutil

for fname in glob('.../2020-04/*gz'):
    with gzip.open(fname, 'rb') as f_in:
        with open('.../datafiles/202004_twitter/decompressed.txt', 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)

But the decompressed.txt file only contains the data from the last .gz file.

Just move f_out to the outside, so that you open it once before iterating over the input files and keep that one handle open. Opening it in 'wb' mode inside the loop truncates the file on every iteration, which is why only the last file's data survives.

from glob import glob
import gzip
import shutil

# Open the output once, before the loop, so it is not truncated for each input file.
with open('.../datafiles/202004_twitter/decompressed.txt', 'wb') as f_out:
    for fname in glob('.../2020-04/*gz'):
        with gzip.open(fname, 'rb') as f_in:
            shutil.copyfileobj(f_in, f_out)
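
For anyone who wants to try this on their own files, here is a minimal sketch of the same idea wrapped in a function. The name `concat_gz` and the `sorted()` call are my own additions, not part of the code above: `glob()` returns matches in arbitrary order, so sorting just makes the order of the output predictable.

```python
import gzip
import shutil
from glob import glob


def concat_gz(pattern, out_path):
    """Decompress every file matching `pattern` into one output file."""
    # sorted() is an addition: glob() returns matches in arbitrary order.
    paths = sorted(glob(pattern))
    # Open the output once, so each input is written after the previous one.
    with open(out_path, 'wb') as f_out:
        for fname in paths:
            with gzip.open(fname, 'rb') as f_in:
                shutil.copyfileobj(f_in, f_out)
    return paths
```

It would be called with the same two paths as above, for example `concat_gz('.../2020-04/*gz', '.../datafiles/202004_twitter/decompressed.txt')`. The decompressed bytes are copied back to back with no separator, so if an input does not end with a newline, its last JSON record will run straight into the first record of the next file.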

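Use "ab" mode instead. a opens the file in append mode, whereas w on its own truncates the file every time it is opened. Note that w and a cannot be combined: open() accepts only one of them, so a mode string such as "wba" raises a ValueError.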
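
A rough sketch of the append-mode variant, keeping the original loop structure. Python's open() takes exactly one of r, w, a or x, so binary append is spelled 'ab'. The function name `append_gz` is hypothetical, and removing a stale output file first is my own assumption: append mode never truncates, so rerunning the script would otherwise add the data a second time.

```python
import gzip
import os
import shutil
from glob import glob


def append_gz(pattern, out_path):
    """Append the decompressed contents of each matching file to out_path."""
    # Assumption: start from a clean file, since 'ab' never truncates.
    if os.path.exists(out_path):
        os.remove(out_path)
    # sorted() is an addition to make the output order predictable.
    for fname in sorted(glob(pattern)):
        with gzip.open(fname, 'rb') as f_in:
            # 'ab' appends, so data from earlier files survives each reopen.
            with open(out_path, 'ab') as f_out:
                shutil.copyfileobj(f_in, f_out)
```

This reopens the output once per input file, so opening it a single time outside the loop is still the simpler option.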
