
Import csv files from folders inside zip folder

I have a zip archive, zip_file.zip, that contains thousands of folders. These folders in turn hold thousands of .csv files, and I want to import all of the csv files and concatenate them. I tried a solution that I found on Stack Overflow (shown below), but it doesn't work. Could you please help?

import zipfile
import pandas as pd
import glob

path = zipfile.ZipFile('/zip_file.zip')
all_files = all_files = glob.glob(path + "/*.csv")
li = []
for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=0)
    li.append(df)

frame = pd.concat(li, axis=0, ignore_index=True)

One option is to use dask, which uses fsspec under the hood to handle complex read situations:

from dask.dataframe import read_csv

# this line will create a pandas dataframe
df = read_csv('zip://*.csv::zip_file.zip').compute()

Note that the .compute() call assumes that the data fits into memory. If it does not, you will need to think further about how you want the data to be processed.

Also, the above assumes that you have dask installed. If not, install it from the terminal/shell via pip (or conda):

pip install dask
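
If installing dask is not an option, the same result can probably be reached with only the standard-library zipfile module and pandas. The attempt in the question fails because glob only searches the regular file system and cannot look inside an archive, and because a ZipFile object cannot be joined to a string with +. The sketch below instead lists the archive members itself and reads each .csv straight from the archive, at any folder depth, so it does not depend on how a glob pattern treats nested folders. The helper names iter_csv_frames and read_all_csvs are made up for illustration, and the sketch assumes every csv has a header row and compatible columns.

```python
import zipfile

import pandas as pd


def iter_csv_frames(zip_path):
    """Yield one DataFrame per .csv member of the archive (hypothetical helper)."""
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # namelist() returns full member paths such as "folder_1/a.csv".
            # Directory entries end with "/" and fail the suffix check below.
            if name.lower().endswith(".csv"):
                with archive.open(name) as member:
                    # read_csv accepts the file-like object returned by open().
                    yield pd.read_csv(member, index_col=None, header=0)


def read_all_csvs(zip_path):
    """Concatenate every .csv in the archive into one DataFrame (hypothetical helper)."""
    # pd.concat raises ValueError if the archive contains no .csv files.
    return pd.concat(iter_csv_frames(zip_path), axis=0, ignore_index=True)
```

Calling read_all_csvs('zip_file.zip') should then give a single frame like the one the question was trying to build. If the combined data does not fit into memory, iter_csv_frames can be looped over directly so that each file is processed and discarded before the next one is read.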
