[英]Pandas read excel files in folder and Unpivot columns into Dataframe
我在具有不同列名和數據類型的文件夾中有大約(100 個文件 +)XLSX 文件
文件 1:
Id test category
1 ab 4
2 cs 3
3 cs 1
文件 2:
index remove stocks category
1 dr 4 a
2 as 3 b
3 ae 1 v
文件 3:……
文件 4……
這是我基於另一個例子的嘗試:
# current directory (including python script & all excel files)
mydir = (os.getcwd()).replace('\\','/') + '/'
#Get all excel files include subdir
filelist=[]
for path, subdirs, files in os.walk(mydir):
for file in files:
if (file.endswith('.xlsx') or file.endswith('.xls') or file.endswith('.XLS')):
filelist.append(os.path.join(path, file))
number_of_files=len(filelist)
print(filelist)
# Read all excel files and save to dataframe (df[0] - df[x]),
# x is the number of excel files that have been read - 1
df=[]
for i in range(number_of_files):
try:
df.melt(pd.read_excel(r''+filelist[i]))
except:
print('Empty Excel File')
print(df)
結果:
Empty Excel File
Empty Excel File
Empty Excel File
Empty Excel File
[]
我怎樣才能取消數據透視而不是“附加”列中的數據?
我想將我的所有文件數據轉為這種數據框格式。
數據框:
Id 1
Id 2
Id 3
test ab
test cs
test cs
category 4
category 3
category 1
index 1
index 1
index 1
remove dr
remove as
remove ae
stocks 4
stocks 3
stocks 1
category a
category b
category v
我已經用您的示例輸入對其進行了測試:
one={"Id": [1,2,3], "test": ["ab","cs","cs"], "category": [4,3,1]}
two= {"index": [1,2,3], "remove": ["dr","as","ae"], "stocks": [4,3,1], "category": ["a", "b", "v"]}
df1 = pd.DataFrame(one)
df2 = pd.DataFrame(two)
final = pd.concat([df1.melt(),df2.melt()])
final:
variable value
0 Id 1
1 Id 2
2 Id 3
3 test ab
4 test cs
5 test cs
6 category 4
7 category 3
8 category 1
0 index 1
1 index 2
2 index 3
3 remove dr
4 remove as
5 remove ae
6 stocks 4
7 stocks 3
8 stocks 1
9 category a
10 category b
11 category v
你可以使用:
import pandas as pd
import pathlib
data = []
for filename in pathlib.Path.cwd().iterdir():
if filename.suffix.lower().startswith('.xls'):
data.append(pd.read_excel(filename).melt())
df = pd.concat(data, ignore_index=True)
輸出:
>>> df
variable value
0 Id 1
1 Id 2
2 Id 3
3 test ab
4 test cs
5 test cs
6 category 4
7 category 3
8 category 1
9 index 1
10 index 2
11 index 3
12 remove dr
13 remove as
14 remove ae
15 stocks 4
16 stocks 3
17 stocks 1
18 category a
19 category b
20 category v
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.