Handling multiple column headers and same column names in csv - pandas/python
I have a csv file that looks like this:
       PROD1  PROD1  PROD2  PROD2
       X      Y      X      Y
AA  A  1      2      9      10
BB  B  3      4      11     12
CC  C  5      6      13     14
DD  D  7      8      15     16
The output I am trying to get has to look like this:
              X   Y
AA  A  PROD1  1   2
BB  B  PROD1  3   4
CC  C  PROD1  5   6
DD  D  PROD1  7   8
AA  A  PROD2  9   10
BB  B  PROD2  11  12
CC  C  PROD2  13  14
DD  D  PROD2  15  16
I tried transposing the csv as I read it with:
data=pd.read_csv('transposedata.csv', header=None).T
But then I lose the column info. I also tried this, from another solution provided here on Stack Overflow:
df = pd.read_csv('transposedata.csv', header=[0,1])
a = df.columns.get_level_values(0).to_series()
b = a.mask(a.str.startswith('Unnamed')).ffill().fillna('')
df.columns = [b, df.columns.get_level_values(1)]
I end up with:
                                           PROD1   PROD2
   Unnamed: 0_level_1  Unnamed: 1_level_1  X   Y   X   Y
0  AA                  A                   1   2   9   10
1  BB                  B                   3   4   11  12
2  CC                  C                   5   6   13  14
3  DD                  D                   7   8   15  16
Any help?
Update: when I run the solution given,
data=pd.read_csv('transposedata1.csv', header=[0,1]).stack(level=0).sort_index(level=1)
I get this:
                      Unnamed:0_level_1  Unnamed:1_level_1  X    Y
0  PROD1              NaN                NaN                1    2
1  PROD1              NaN                NaN                3    4
2  PROD1              NaN                NaN                5    6
3  PROD1              NaN                NaN                7    8
0  PROD2              NaN                NaN                9    10
1  PROD2              NaN                NaN                11   12
2  PROD2              NaN                NaN                13   14
3  PROD2              NaN                NaN                15   16
0  Unnamed:0_level_0  AA                 NaN                NaN  NaN
1  Unnamed:0_level_0  BB                 NaN                NaN  NaN
2  Unnamed:0_level_0  CC                 NaN                NaN  NaN
3  Unnamed:0_level_0  DD                 NaN                NaN  NaN
0  Unnamed:1_level_0  NaN                A                  NaN  NaN
1  Unnamed:1_level_0  NaN                B                  NaN  NaN
2  Unnamed:1_level_0  NaN                C                  NaN  NaN
3  Unnamed:1_level_0  NaN                D                  NaN  NaN
Thanks
You do not want to transpose the dataframe; you want to stack one column level. You simply need to tell pandas that the csv file has a 2-row header:
data=pd.read_csv('transposedata.csv', header=[0,1]).stack(level=0).sort_index(level=2)
It should give:它应该给出:
              X   Y
AA  A  PROD1  1   2
BB  B  PROD1  3   4
CC  C  PROD1  5   6
DD  D  PROD1  7   8
AA  A  PROD2  9   10
BB  B  PROD2  11  12
CC  C  PROD2  13  14
DD  D  PROD2  15  16
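The `Unnamed` labels in the question's update suggest that the two header rows start with empty cells, so the first two columns were read as data instead of as the row index. A minimal sketch of one way around that is to pass `index_col=[0, 1]` as well, so both label columns become index levels before stacking. It assumes the file looks like the inline `csv_text` below, which is a reconstruction and not the asker's actual file:

```python
import io

import pandas as pd

# Assumed reconstruction of the file: the two leading cells of each
# header row are empty (hypothetical content, for illustration only).
csv_text = """,,PROD1,PROD1,PROD2,PROD2
,,X,Y,X,Y
AA,A,1,2,9,10
BB,B,3,4,11,12
CC,C,5,6,13,14
DD,D,7,8,15,16
"""

# header=[0, 1]   -> the first two rows form a two-level column header
# index_col=[0, 1] -> the first two columns become a two-level row index
df = pd.read_csv(io.StringIO(csv_text), header=[0, 1], index_col=[0, 1])

# Move the product level (column level 0) into the row index, then sort
# on that new third index level so the rows are grouped by product.
result = df.stack(level=0).sort_index(level=2)
print(result)
```

With the index declared this way, the stacked frame has three index levels, which is why `sort_index(level=2)` refers to the product level.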