How to read multiple files from a directory and append them in Python?
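I am working with the files in a folder and need a better way to loop over them and append columns to build a master file. With two files, I read them into two dataframes and appended the series. Now, however, I have a case with more than 100 files. File 1 looks like this: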
Num  Department   Product  Salesman  Location             rating1
1    Electronics  TV       3         Bigmart, Delhi       5
2    Electronics  TV       1         Bigmart, Mumbai      4
3    Electronics  TV       2         Bigmart, Bihar       3
4    Electronics  TV       2         Bigmart, Chandigarh  5
5    Electronics  Camera   2         Bigmart, Jharkhand   5
Similarly, file 2:
Num  Department   Product  Salesman  Location             rating2
1    Electronics  TV       3         Bigmart, Delhi       2
2    Electronics  TV       1         Bigmart, Mumbai      4
3    Electronics  TV       2         Bigmart, Bihar       4
4    Electronics  TV       2         Bigmart, Chandigarh  5
5    Electronics  Camera   2         Bigmart, Jharkhand   3
What I want to achieve is to read the rating column from every other file and append it as an additional column. Expected:
Num  Department   Product  Salesman  Location             rating1  rating2
1    Electronics  TV       3         Bigmart, Delhi       5        2
2    Electronics  TV       1         Bigmart, Mumbai      4        4
3    Electronics  TV       2         Bigmart, Bihar       3        4
4    Electronics  TV       2         Bigmart, Chandigarh  5        5
5    Electronics  Camera   2         Bigmart, Jharkhand   5        3
I adapted some of the code posted here. The following code works:
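import os
import pandas as pd

def read_folder(folder):
    # Collect the Excel files; sorted so the column order is reproducible
    files = sorted(f for f in os.listdir(folder) if f.endswith('.xlsx'))
    df = pd.read_excel(os.path.join(folder, files[0]))
    for f in files[1:]:
        df2 = pd.read_excel(os.path.join(folder, f))
        # Attach the sixth column (the rating) of each further file, matched on row index
        df = df.merge(df2.iloc[:, 5], left_index=True, right_index=True)
    return df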
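This function reads the folder and returns everything as pandas dataframes: the list of per-file frames and one frame with all rows concatenated.
import os
import pandas as pd

def read_folder(csv_folder):
    # Sorted so that the order of the frames is reproducible
    files = sorted(os.listdir(csv_folder))
    df = []
    for f in files:
        print(f)
        csv_file = os.path.join(csv_folder, f)
        df.append(pd.read_csv(csv_file))
    # Stack all frames row-wise into a single DataFrame
    df_full = pd.concat(df, ignore_index=True)
    return df, df_full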
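To see what row-wise concatenation produces, here is a small sketch with two in-memory frames standing in for the files. The names `file1` and `file2` and the shortened rows are illustrative only and do not come from the post. Because `pd.concat` stacks rows by default, differently named rating columns are padded with NaN, which is why a further step is needed to get one row per record.

```python
import pandas as pd

# Illustrative stand-ins for two of the files (first two rows of each sample)
file1 = pd.DataFrame({
    "Num": [1, 2],
    "Location": ["Bigmart, Delhi", "Bigmart, Mumbai"],
    "rating1": [5, 4],
})
file2 = pd.DataFrame({
    "Num": [1, 2],
    "Location": ["Bigmart, Delhi", "Bigmart, Mumbai"],
    "rating2": [2, 4],
})

# Row-wise stacking: 4 rows, and each rating column is NaN for the other file's rows
stacked = pd.concat([file1, file2], ignore_index=True)
print(stacked)
```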
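As I understand your last comment, you need to add the rating columns and create a single file. Once all the files have been read, you can do the following.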
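# df is the list of dataframes built in read_folder()
final_df = df[0].copy()
# Assumes every file stores its ratings in a column named "rating"
for i, d in enumerate(df[1:], start=1):
    final_df["rating_" + str(i)] = d["rating"]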
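If the rating column has a different name in each file, as in the sample data (`rating1`, `rating2`), the loop can be adapted. The following is only a sketch: the helper name `combine_ratings` is made up for illustration, and it assumes that the rating is always the last column, that the rating columns have distinct names, and that all files share the same identifying columns.

```python
import pandas as pd

KEYS = ["Num", "Department", "Product", "Salesman", "Location"]

def combine_ratings(frames, keys=KEYS):
    """Hypothetical helper: put the last column of every frame side by side.

    Assumes each frame holds the key columns plus one rating column (last).
    """
    # Index each frame by the key columns and keep only its last column
    ratings = [f.set_index(keys).iloc[:, -1] for f in frames]
    # axis=1 aligns the Series on the shared index instead of stacking rows
    return pd.concat(ratings, axis=1).reset_index()

# Two small frames standing in for the files read from disk
f1 = pd.DataFrame({"Num": [1, 2], "Department": ["Electronics"] * 2,
                   "Product": ["TV", "TV"], "Salesman": [3, 1],
                   "Location": ["Bigmart, Delhi", "Bigmart, Mumbai"],
                   "rating1": [5, 4]})
f2 = f1.drop(columns="rating1").assign(rating2=[2, 4])

combined = combine_ratings([f1, f2])
print(combined)
```

Aligning on the key columns rather than on row position means a file whose rows are in a different order would still be matched to the right records.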
This version of read_folder() returns a list of data frames. It also adds a helper column (used to label the ratings).
import pandas as pd
from pathlib import Path

def read_folder(csv_folder):
    ''' Input is a folder with csv files; return list of data frames.'''
    csv_folder = Path(csv_folder).absolute()
    # sorted() keeps the rating-N numbering stable between runs
    csv_files = sorted(f for f in csv_folder.iterdir() if f.name.endswith('.csv'))
    # the assign() method adds a helper column
    dfs = [
        pd.read_csv(csv_file).assign(rating_src = f'rating-{idx}')
        for idx, csv_file in enumerate(csv_files, 1)
    ]
    return dfs
Now assemble the data frames into the desired shape:
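dfs = read_folder(csv_folder)
# Assumes every file has the same columns, with the ratings in one shared column
dfs = (pd.concat(dfs)
       .set_index(['Num', 'Department', 'Product', 'Salesman', 'Location', 'rating_src'])
       .squeeze()                      # single remaining column becomes a Series
       .unstack(level='rating_src')    # one column per source file
       .reset_index()
       )
dfs.columns.name = ''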
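To check the reshaping step without any files on disk, a small sketch with in-memory frames can help. It assumes that every file stores its ratings in one shared column, called `rating` here (that name is an assumption, since the sample files use `rating1` and `rating2`), and the rows are shortened from the question.

```python
import pandas as pd

keys = ["Num", "Department", "Product", "Salesman", "Location"]
base = pd.DataFrame({"Num": [1, 2], "Department": ["Electronics"] * 2,
                     "Product": ["TV", "TV"], "Salesman": [3, 1],
                     "Location": ["Bigmart, Delhi", "Bigmart, Mumbai"]})

# Mimic the list of frames: one frame per file plus the helper column
frames = [
    base.assign(rating=[5, 4], rating_src="rating-1"),
    base.assign(rating=[2, 4], rating_src="rating-2"),
]

wide = (pd.concat(frames)
        .set_index(keys + ["rating_src"])
        .squeeze()                      # one column left, so this yields a Series
        .unstack(level="rating_src")    # one column per source file
        .reset_index())
wide.columns.name = ""
print(wide)
```

The source labels are strings, so with ten or more files `rating-10` would sort before `rating-2`; zero-padding the index (for example `f'rating-{idx:03d}'`) is one way to keep the columns in file order.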