Reading multiple files in Python

Original post: 2018-02-24 15:30:38 · Tags: python, filereader

I have a dataset of more than 300,000 files that I need to read and append to a dictionary.

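import os
import pandas as pd

corpus_path = "data"
article_paths = [os.path.join(corpus_path, p) for p in os.listdir(corpus_path)]

doc = []
for path in article_paths:
    # quoting=3 is csv.QUOTE_NONE; on_bad_lines='skip' replaces
    # error_bad_lines=False, which was removed in pandas 2.0
    dp = pd.read_table(path, header=None, encoding='utf-8', quoting=3, on_bad_lines='skip')
    doc.append(dp)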

Is there a faster way to do this? The current approach takes more than an hour.
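
The text asks for a dictionary, while the snippet collects the frames into a list. A minimal sketch of the dictionary variant, keyed by file name, might look like the following. `read_corpus` is a hypothetical helper name, not part of the original post, and `on_bad_lines="skip"` assumes pandas 1.3 or later. It is still sequential, so it is not expected to be faster on its own.

```python
import csv
import os

import pandas as pd


def read_corpus(corpus_path):
    """Read every regular file in corpus_path into a dict keyed by file name.

    Hypothetical helper for illustration; parsing options mirror the question.
    """
    docs = {}
    # os.scandir yields directory entries without building a full list first
    with os.scandir(corpus_path) as entries:
        for entry in entries:
            if not entry.is_file():
                continue  # skip subdirectories
            docs[entry.name] = pd.read_table(
                entry.path,
                header=None,
                encoding="utf-8",
                quoting=csv.QUOTE_NONE,  # same as quoting=3
                on_bad_lines="skip",     # assumes pandas >= 1.3
            )
    return docs
```

Calling `read_corpus("data")` would return one DataFrame per file, so a single document can be looked up by its file name instead of its position in a list.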

1 Solution

You can use the multiprocessing module.

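import os
from multiprocessing import Pool

import pandas as pd

def readFile(path):
    # Runs in a worker process: parse one file into a DataFrame.
    return pd.read_table(path, header=None, encoding='utf-8', quoting=3, on_bad_lines='skip')

if __name__ == '__main__':
    nprocs = os.cpu_count()  # number of worker processes
    with Pool(processes=nprocs) as pool:
        # article_paths is the list built in the question
        result = list(pool.imap(readFile, article_paths))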
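
With roughly 300,000 small files, sending one path per task can leave the workers waiting on inter-process communication. The sketch below is one possible variation on the answer, not the answerer's own code: it passes a `chunksize` to the pool so paths are sent in batches, and it uses `imap_unordered` because the results go into a dictionary keyed by path and do not need to come back in order. `read_one`, `read_all`, and the default `chunksize=256` are assumptions chosen for illustration, and `on_bad_lines="skip"` assumes pandas 1.3 or later.

```python
import csv
from multiprocessing import Pool

import pandas as pd


def read_one(path):
    """Parse one file and return (path, DataFrame); runs in a worker process."""
    frame = pd.read_table(
        path,
        header=None,
        encoding="utf-8",
        quoting=csv.QUOTE_NONE,  # same as quoting=3
        on_bad_lines="skip",     # assumes pandas >= 1.3
    )
    return path, frame


def read_all(paths, processes=None, chunksize=256):
    """Read all paths into a dict keyed by path.

    processes=None lets Pool pick os.cpu_count() workers. processes=1 skips
    the pool and reads in a plain loop, which is easier to debug.
    """
    if processes == 1:
        return dict(read_one(p) for p in paths)
    with Pool(processes=processes) as pool:
        # chunksize sends paths to the workers in batches instead of one by one
        return dict(pool.imap_unordered(read_one, paths, chunksize=chunksize))
```

A call such as `docs = read_all(article_paths)` should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Each DataFrame is pickled on its way back to the parent process, so the actual speedup depends on file size and disk throughput and is worth measuring on a sample of the corpus first.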

Related questions

Python: reading and writing multiple files

Reading multiple depended files in python

Reading multiple files at once in python

Reading .env in multiple files [Python]

python programming-reading from multiple files

Reading multiple CSV files and merge Python Pandas

Reading multiple CSV files with headers in python with numpy?

Python and Dask - reading and concatenating multiple files

How to Return in python, reading multiple .xml files

Reading and manipulating multiple netcdf files in python

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost this page, please credit this site's URL or the original post's address.
