I have a folder with many zip files, and each zip file contains multiple CSV files. Is there any way to get all of the .csv files into one dataframe in Python? Or any way I can pass a list of zip files?
The code I am currently trying is:
import glob
import zipfile
import pandas as pd

for zip_file in glob.glob(r"C:\Users\harsh\Desktop\Temp\data_00-01.zip"):
    # This is just one file. There are multiple zip files in the folder
    zf = zipfile.ZipFile(zip_file)
    dfs = [pd.read_csv(zf.open(f), header=None, sep=";", encoding='latin1') for f in zf.namelist()]
    df = pd.concat(dfs, ignore_index=True)
    print(df)
This code works for one zip file, but I have about 50 zip files in the folder and I would like to read and concatenate all the CSV files in those zip files into one dataframe.
Thanks
The following code should satisfy your requirements (just edit dir_name
according to what you need):
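import os
import zipfile
import pandas as pd

# Folder containing the zip files (taken from the question; edit as needed)
dir_name = r"C:\Users\harsh\Desktop\Temp"

dfs = []
for filename in os.listdir(dir_name):
    if filename.endswith('.zip'):
        zip_file = os.path.join(dir_name, filename)
        # Read every member of this archive and collect the resulting dataframes
        with zipfile.ZipFile(zip_file) as zf:
            dfs += [pd.read_csv(zf.open(f), header=None, sep=";", encoding='latin1') for f in zf.namelist()]

# Concatenate once, after all archives have been read
df = pd.concat(dfs, ignore_index=True)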
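If some archives may contain members that are not CSV files, or the folder may turn out to be empty, a slightly more defensive variant can help. The following is only a sketch: the helper name read_zipped_csvs and its keyword pass-through are made up for illustration and are not part of pandas or zipfile. It uses glob to find the archives, skips any member whose name does not end in .csv, and returns an empty dataframe when nothing is found, since pd.concat raises an error on an empty list.

```python
import glob
import os
import zipfile

import pandas as pd


def read_zipped_csvs(dir_name, **read_csv_kwargs):
    """Concatenate every .csv member of every .zip in dir_name.

    Hypothetical helper for illustration; extra keyword arguments are
    forwarded unchanged to pd.read_csv.
    """
    frames = []
    # sorted() keeps the row order reproducible across runs
    for zip_path in sorted(glob.glob(os.path.join(dir_name, "*.zip"))):
        with zipfile.ZipFile(zip_path) as zf:
            for member in zf.namelist():
                # Skip directory entries and anything that is not a CSV file
                if not member.lower().endswith(".csv"):
                    continue
                with zf.open(member) as fh:
                    frames.append(pd.read_csv(fh, **read_csv_kwargs))
    if not frames:
        # pd.concat raises ValueError on an empty list, so return an empty frame
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)
```

With the settings from the question, the call would look like df = read_zipped_csvs(r"C:\Users\harsh\Desktop\Temp", header=None, sep=";", encoding="latin1").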