Read gz file in python

I am trying to read/extract the contents of the file train.gz.

My code:

import gzip
with gzip.open('train.gz', 'rb') as f:
    file_content = f.read()

When I run:

print(file_content)

I get this error (in a Jupyter notebook):

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
/tmp/ipykernel_2392/4036593255.py in <module>
----> 1 print(file_content)

MemoryError: 

Any suggestions?

MemoryError suggests that the file is too big for your runtime to process.

IIUC, train.gz may be a training model, and it may be that you must handle this model as a single chunk. If so, your best solution is to find a machine with more memory.

If at all possible (and this is strongly preferred), you should stream the uncompressed file through your program so that you can constrain the buffer (the in-memory window onto the file), which limits the chance that you'll run out of memory.
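
As a rough sketch of the streaming approach, assuming the decompressed data can be processed piece by piece: gzip.open returns a file-like object, so you can call read() with a size instead of reading everything at once. The helper names iter_chunks and count_uncompressed_bytes and the 1 MiB chunk size are made up for illustration; replace the counting step with whatever processing you actually need.

```python
import gzip


def iter_chunks(path, chunk_size=1024 * 1024):
    """Yield the decompressed contents of a gzip file in pieces.

    Each piece holds at most chunk_size bytes, so only one piece
    needs to be in memory at a time.
    """
    with gzip.open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            yield chunk


def count_uncompressed_bytes(path):
    """Example consumer: total decompressed size, computed chunk by chunk."""
    total = 0
    for chunk in iter_chunks(path):
        total += len(chunk)
    return total
```

For example, count_uncompressed_bytes('train.gz') would tell you how large the decompressed data really is without holding it all in memory. If the file is text, opening it with gzip.open('train.gz', 'rt') and looping over the file object line by line is another option.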
