Read multiple txt files in Python
I have 6000 txt files to read in Python. I am trying to read them, but the text in every file is broken across multiple lines.
Subject: key dates and impact of upcoming sap implementation over the next few weeks , project apollo and beyond will conduct its final sap implementation ) this implementation will impact approximately 12 , 000 new users plus all existing system users . sap brings a new dynamic to enron , enhancing the timely flow and sharing of specific project , human resources , procurement , and financial information across business units and across continents . this final implementation will retire multiple , disparate systems and replace them with a common , integrated system encompassing many processes including payroll , timekeeping ...
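So when I read the files one by one, Python splits each of them into lines (I know that sounds ridiculous). As a result, one email ends up divided into several lines. I have tried read_csv on all the txt files, but Python gives the error ValueError: stat: path too long for Windows. I don't know what to do from here.
This is what I tried: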
import glob
import errno

path = r'C:\Users\frknk\OneDrive\Masaüstü\enron6\emails\*.txt'
files = glob.glob(path)
for name in files:
    try:
        with open(name) as f:
            for line in f:
                print(line.split())
    except IOError as exc:
        if exc.errno != errno.EISDIR:
            raise
['Subject:', 'key', 'dates', 'and', 'impact', 'of', 'upcoming', 'sap', 'implementation']
['over', 'the', 'next', 'few', 'weeks', ',', 'project', 'apollo', 'and', 'beyond', 'will', 'conduct', 'its', 'final', 'sap']
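I need to work with these email by email, but the text comes out separated line by line. What I want is for each email to be represented by a single line.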
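You can read the whole text file into one variable and then manipulate it as needed. Just replace for line in f with data = f.read(). Below, I read each txt file into the data variable and then split it to get the individual words. Hope this helps.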
for name in files:
    try:
        with open(name) as f:
            # Read the whole file at once and turn newlines into spaces,
            # so the email becomes one line and words at line ends stay separate
            data = f.read().replace("\n", " ")
            print(data.split())
    except IOError as exc:
        # Ignore "is a directory" errors; re-raise any other I/O error
        if exc.errno != errno.EISDIR:
            raise
The output will look like this:
['Subject:', 'key', 'dates', 'and', 'impact', 'of', 'upcoming', 'sap', 'implementation', 'over', 'the', 'next', 'few', 'weeks', ',', 'project', 'apollo', 'and', 'beyond', 'will', 'conduct', 'its', 'final', 'sap', 'implementation', ')', 'this', 'implementation', 'will', 'impact', 'approximately', '12', ',', '000', 'new', 'users', 'plus', 'all', 'existing', 'system', 'users', '.', 'sap', 'brings', 'a', 'new', 'dynamic', 'to', 'enron', ',', 'enhancing', 'the', 'timely', 'flow', 'and', 'sharing', 'of', 'specific', 'project', ',', 'human', 'resources', ',', 'procurement', ',', 'and', 'financial', 'information', 'across', 'business', 'units', 'and', 'across', 'continents', '.', 'this', 'final', 'implementation', 'will', 'retire', 'multiple', ',', 'disparate', 'systems', 'and', 'replace', 'them', 'with', 'a', 'common', ',', 'integrated', 'system', 'encompassing', 'many', 'processes', 'including', 'payroll', ',', 'timekeeping', '...']
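If the goal is one string per email rather than a list of words, a minimal sketch along the same lines might collect each file's text as a single line. The function name read_emails is hypothetical, and the encoding and errors settings are assumptions about the files, not something stated in the question.

```python
from pathlib import Path


def read_emails(folder):
    """Return a list with one single-line string per .txt file in `folder`."""
    emails = []
    # sorted() only makes the order reproducible between runs
    for path in sorted(Path(folder).glob("*.txt")):
        # Assumption: the files are UTF-8 text; undecodable bytes are
        # replaced instead of raising UnicodeDecodeError
        text = path.read_text(encoding="utf-8", errors="replace")
        # split() breaks on any whitespace, including newlines; joining
        # the words with a single space puts the whole email on one line
        emails.append(" ".join(text.split()))
    return emails
```

Called with the folder that holds the txt files, this would return a list in which every element is one complete email on a single line, which can then be split into words or passed on to other processing as needed.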