How to open folder and place text files in dataframe and rename dataframe based on file name?
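I am trying to open a folder that contains several text files, load each file into its own DataFrame, and name each DataFrame after its file.
So far my code recognizes the 5 files in the folder, but it does not put the data from each file into a DataFrame named after that file. Can someone tell me how to do this?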
Code:
import os
import pandas as pd
import pypyodbc
loc = 'D:/filepath to folder with files'
os.chdir(loc)
filelist = os.listdir()
#print (len((pd.concat([pd.read_csv(item, names=[item[:-4]]) for item in filelist],axis=1))))
data = []
path = loc
files = [f for f in os.listdir(path) if os.path.isfile(f)]
for f in files:
    with open(f,'r') as myfile:
        data.append(myfile.read())
df = pd.DataFrame(data)
print (df.shape)
Thanks in advance.
Edit: what the data in the files looks like:
0010010000013 1 CITY OF HOUSTON 1.000
0010020000001 1 CURRENT OWNER 1.000
0010020000003 1 MILBY CHARLES FAMILY PTNSH 1.000
0010020000004 1 FEAGIN MICHAEL RYAN TRUST 1.000
0010020000013 1 BUFFALO BAYOU PARTNERSHIP 1.000
0010020000015 1 BUFFALO BAYOU PARTNERSHIP 1.000
0010020000016 1 USRP PAC LP SPAGHETTI WAREHOUSE 1.000
0010020000023 1 CITY OF HOUSTON 1.000
0010020000024 1 LUISA MILBY FEAGIN 2007 TRUST 1.000
0010030000001 1 BUFFALO BAYOU PARTNERSHIP 1.000
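-Edit- Final answer:
import glob
# Note: error_bad_lines was deprecated in pandas 1.3 and removed in 2.0; newer versions use on_bad_lines='skip' instead.
dfs = {os.path.basename(f): pd.read_csv(f, sep='\t', header=None,encoding='cp037',error_bad_lines=False) for f in glob.glob('D:/TX/Houston_County/Real_acct_owner/*.txt')}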
Something like this should create a dict in which each key (= file name) holds a DataFrame with the contents of the corresponding file.
filedfs = {}
for f in files:
    filedfs[f] = pd.read_csv(os.path.join(loc, f))
Or, as a one-liner proposed by @MaxU:
dfs = {os.path.basename(f): pd.read_csv(f, delim_whitespace=True, header=None) for f in glob.glob('c:/data/*.csv')}
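The dict approach above can be sketched as a small, self-contained script. This is only an illustration under a few assumptions: the files are tab-separated (as in the asker's final answer) and have no header row. The helper name `load_folder` and the two demo files are made up for the example. Keys are the file names without the extension, so each frame can be looked up by name.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_folder(folder, pattern="*.txt", sep="\t"):
    """Return {file name without extension: DataFrame} for matching files.

    Hypothetical helper for illustration; not part of pandas.
    """
    frames = {}
    for path in sorted(Path(folder).glob(pattern)):
        # header=None because the sample rows have no header line
        frames[path.stem] = pd.read_csv(path, sep=sep, header=None)
    return frames


# Demo on a throwaway folder holding two tiny, made-up files
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "owners_a.txt").write_text(
        "0010010000013\t1\tCITY OF HOUSTON\t1.000\n"
        "0010020000001\t1\tCURRENT OWNER\t1.000\n"
    )
    Path(tmp, "owners_b.txt").write_text(
        "0010020000013\t1\tBUFFALO BAYOU PARTNERSHIP\t1.000\n"
    )
    dfs = load_folder(tmp)

print(sorted(dfs))            # names of the loaded frames
print(dfs["owners_a"].shape)  # rows and columns of one frame
```

A dict is usually easier to work with than creating one variable per file, because the set of names is only known at run time.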
Input:
0010010000013,1,CITY OF HOUSTON,1.000
0010020000001,1,CURRENT OWNER,1.000
Code:
import os
import pandas

loc = 'folder/'
list_of_df = []
# Read every .txt file in the folder into its own DataFrame
for f in os.listdir(loc):
    if f.endswith(".txt"):
        df = pandas.read_csv(loc+f, sep = ',', names = ['number', 'count', 'buyer', 'status'])
        list_of_df.append(df)
# Print each DataFrame, separated by a marker line
for df in list_of_df:
    print(df)
    print('--')
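One detail worth checking with this data: the first column looks like an account identifier with leading zeros, and by default `read_csv` parses it as an integer, which drops those zeros. A minimal sketch follows, assuming the identifier should be kept as text. It reuses the column names from the code above and reads the sample rows from an in-memory string instead of a file.

```python
import io

import pandas as pd

# The two sample input rows from above, held in memory for the demo
sample = (
    "0010010000013,1,CITY OF HOUSTON,1.000\n"
    "0010020000001,1,CURRENT OWNER,1.000\n"
)
columns = ["number", "count", "buyer", "status"]

# Default parsing: the first column is read as an integer
as_numbers = pd.read_csv(io.StringIO(sample), names=columns)

# dtype={"number": str} keeps the column as text, leading zeros included
as_text = pd.read_csv(io.StringIO(sample), names=columns, dtype={"number": str})

print(as_numbers["number"].iloc[0])  # 10010000013
print(as_text["number"].iloc[0])     # 0010010000013
```

The same `dtype` argument can be passed when reading the real files, whichever separator they use.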