I have different zip files that each contain one CSV file. How do I unzip each file and save all the CSV files in one folder?
The code I have written does not work, for some reason.
import pandas as pd
import glob
import zipfile

path = r"C:/Users/nano/Documents/Project" # use your path
all_files = glob.glob(path + "/*.gz")

for folder in all_files:
    with zipfile.ZipFile(folder, "r") as zip_ref:
        zip_ref.extractall(path)
First, you are using zipfile on gzip files. These are different formats, so you need to use the right library. Below is a working example of the code.
import glob
import os
import gzip

path = r"C:/Temp/Unzip" # use your path
all_files = glob.glob(path + "/*.gz")
print(all_files)

for file in all_files:
    folder, filename = os.path.split(file)
    filename = os.path.splitext(filename)[0]  # drop the ".gz" suffix
    # note: an archive named "data.csv.gz" would end up as "data.csv.csv"
    with gzip.open(file, "rb") as gz:
        with open('{0}/{1}.csv'.format(folder, filename), 'wb') as cv:
            cv.write(gz.read())  # gz.read() returns bytes, so use write(), not writelines()
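Since the mix-up here is between zip and gzip, it can help to check what a file really is before picking a library. Below is a minimal sketch, assuming you only need to tell these two formats apart. `detect_archive_type` is a hypothetical helper name, not part of any library; it relies on `zipfile.is_zipfile` and on the two magic bytes (0x1f 0x8b) that start every gzip stream.

```python
import zipfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def detect_archive_type(file_path):
    """Return "zip", "gzip", or "unknown" for the file at file_path (hypothetical helper)."""
    if zipfile.is_zipfile(file_path):
        return "zip"
    with open(file_path, "rb") as fh:
        if fh.read(2) == GZIP_MAGIC:
            return "gzip"
    return "unknown"
```

If this reports "gzip" for your files, use the gzip module; if it reports "zip", zipfile is the right choice regardless of the file extension.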
gzip (.gz) and zip (.zip) are two different formats. For gzip, you can use the gzip module:
import glob
import gzip
import shutil

path = r"C:/Users/shedez/Documents/Project" # use your path
all_files = glob.glob(path + "/*.gz")

for folder in all_files:
    dst = folder[:-3] # destination file name (strips ".gz")
    with gzip.open(folder, 'rb') as f_in, open(dst, 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)
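The question also asks for all the CSV files to end up in one folder. Here is a rough sketch of the same gzip approach that writes into a separate output directory. `gunzip_all` and its parameter names are made up for illustration, and it assumes every .gz file holds CSV data, so it appends ".csv" when the decompressed name lacks it.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_all(src_dir, dst_dir):
    """Decompress every *.gz file in src_dir into dst_dir; return the written paths."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)  # create the output folder if needed
    written = []
    for gz_path in sorted(src_dir.glob("*.gz")):
        name = gz_path.stem  # "data.csv.gz" -> "data.csv", "data.gz" -> "data"
        if not name.lower().endswith(".csv"):
            name += ".csv"  # assumption: every archive holds CSV data
        target = dst_dir / name
        with gzip.open(gz_path, "rb") as f_in, open(target, "wb") as f_out:
            shutil.copyfileobj(f_in, f_out)  # stream, so large files are not held in memory
        written.append(target)
    return written
```

You would call it as gunzip_all(path, path + "/csv"), where the "csv" subfolder name is only an example.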
If you use the gz (gzip) format, you might want to look at the gzip package. I'm not aware of an extract method, but you can do something like this using pandas alone, which I find more convenient:
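import pandas as pd

for folder in all_files:  # all_files comes from glob.glob(path + "/*.gz")
    c = pd.read_csv(folder, compression='gzip')
    c.to_csv(folder[:-2] + "csv")  # folder already holds the full path, so path is not prepended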
The [:-2] cuts off the "gz". You might want to change either the parameters of read_csv (adding a header row, or whatever) or the flags of to_csv (for example, header=False to omit the header row, or index=False to omit the row index) to prevent pandas from adding things you don't want.
Alternatively, you could open it with gzip:
import gzip
import shutil

# read the compressed file through gzip and write the plain bytes to the .csv file
with gzip.open(folder, 'rb') as f_in, open(folder[:-2] + "csv", 'wb') as f_out:
    shutil.copyfileobj(f_in, f_out)
Try out this code:
import os, zipfile

dir_name = 'C:\\Users\\shedez\\Documents\\Project' # ZIP location
extract_dir_name = 'C:\\Users\\shedez\\Documents\\Project\\Unziped' # CSV location after unzip
extension = ".zip" # you might have to change this

os.chdir(dir_name) # change directory from working dir to dir with files

for item in os.listdir(dir_name): # loop through items in dir
    if item.endswith(extension): # check for ".zip" extension
        file_name = os.path.abspath(item) # get full path of files
        zip_ref = zipfile.ZipFile(file_name) # create zipfile object
        zip_ref.extractall(extract_dir_name) # extract file to dir
        zip_ref.close() # close file
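If the archives really are .zip files, a variation on the loop above can pull out only the CSV members and drop them flat into one folder. This is just a sketch: `extract_csvs` and its parameter names are hypothetical, and it assumes CSV file names are unique across archives, since a later file with the same name would overwrite an earlier one.

```python
import os
import shutil
import zipfile


def extract_csvs(zip_dir, out_dir):
    """Copy every .csv member of every .zip in zip_dir into out_dir; return the file names."""
    os.makedirs(out_dir, exist_ok=True)
    extracted = []
    for item in sorted(os.listdir(zip_dir)):
        if not item.lower().endswith(".zip"):
            continue
        # the with-block closes the archive even if extraction fails
        with zipfile.ZipFile(os.path.join(zip_dir, item)) as zip_ref:
            for member in zip_ref.namelist():
                if not member.lower().endswith(".csv"):
                    continue
                name = os.path.basename(member)  # drop any folders stored inside the archive
                with zip_ref.open(member) as src, open(os.path.join(out_dir, name), "wb") as dst:
                    shutil.copyfileobj(src, dst)
                extracted.append(name)
    return extracted
```

Calling extract_csvs(dir_name, extract_dir_name) with the two variables defined above would leave only the CSV files in the output folder.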
If you want to learn more about zipfile, see the documentation for the zipfile module.