Python concat columns
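I have 3 similar tables, each showing a spread figure for each hour of a 24-hour day. I want to combine them into 1 table so that I can compare the 3.
So the result should have 4 columns and 25 rows, where the first row and the first column are headers.
Also, how do I change the headers of the 3 value columns after combining?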
import pandas as pd
# Note: hour has 23 entries ('05' is missing) while spread has 24,
# so zip() silently drops the last spread value (1.350392).
hour = ['00', '01', '02', '03', '04', '06', '07', '08', '09', '10',
        '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21',
        '22', '23']
spread = [27.461988, 2.144416, 0.970719, 0.883571, 1.234078, 0.747148,
          0.660058, 1.025625, 0.660939, 0.600193, 0.412775, 0.503613, 0.468141,
          0.417250, 0.366429, 0.414767, 0.295326, 0.289255, 0.091598, 0.312621,
          0.393910, 0.490924, 0.425078, 1.350392]
df = pd.DataFrame(list(zip(hour, spread)), columns=['hour', 'spread'])
df.set_index("hour", inplace=True)
         spread
hour
00    27.461988
01     2.144416
02     0.970719
03     0.883571
04     1.234078
06     0.747148
07     0.660058
08     1.025625
09     0.660939
10     0.600193
11     0.412775
12     0.503613
13     0.468141
14     0.417250
15     0.366429
16     0.414767
17     0.295326
18     0.289255
19     0.091598
20     0.312621
21     0.393910
22     0.490924
23     0.425078
Use pandas to do an inner join on hour.
# pd.merge defaults to how="inner"; clashing 'spread' columns get _x/_y suffixes
pd.merge(
    pd.merge(df1, df2, on=["hour"]),
    df3, on=["hour"])
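The nested merge above leaves it to pandas to disambiguate the three identically named columns. As a minimal sketch of one way around that, assuming small stand-in frames (the list `frames`, the names `spread_1` to `spread_3`, and the numbers are all hypothetical, not from the question), each column can be renamed first and the inner joins folded with `functools.reduce`:

```python
from functools import reduce

import pandas as pd

# Hypothetical stand-ins for df1, df2, df3: same "hour" index, same "spread" column.
hours = ["00", "01", "02"]
frames = [
    pd.DataFrame({"hour": hours, "spread": [1.0, 2.0, 3.0]}).set_index("hour"),
    pd.DataFrame({"hour": hours, "spread": [4.0, 5.0, 6.0]}).set_index("hour"),
    pd.DataFrame({"hour": hours, "spread": [7.0, 8.0, 9.0]}).set_index("hour"),
]

# Give each frame a distinct column name up front, so no suffixes are needed.
renamed = [
    frame.rename(columns={"spread": f"spread_{i}"})
    for i, frame in enumerate(frames, start=1)
]

# Inner-join the frames pairwise on their shared "hour" index.
merged = reduce(
    lambda left, right: pd.merge(
        left, right, left_index=True, right_index=True, how="inner"
    ),
    renamed,
)
print(merged)
```

Because the join is an inner join, only hours present in all three frames survive, which may or may not be what you want if one table has gaps.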
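You can name the columns after the concat, which overwrites the previous names
(see the .rename variant further down, which has the same effect).
First, generate 3 dataframes with the same column name: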
import pandas as pd
# hour has 23 entries ('05' is missing, as in the question), so zip() keeps 23 rows
hour = ['00', '01', '02', '03', '04', '06', '07', '08', '09', '10',
        '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21',
        '22', '23']
spread = [27.461988, 2.144416, 0.970719, 0.883571, 1.234078, 0.747148,
          0.660058, 1.025625, 0.660939, 0.600193, 0.412775, 0.503613, 0.468141,
          0.417250, 0.366429, 0.414767, 0.295326, 0.289255, 0.091598, 0.312621,
          0.393910, 0.490924, 0.425078, 1.350392]
df_1 = pd.DataFrame(list(zip(hour, spread)), columns=['hour', 'spread'])
df_1.set_index("hour", inplace=True)
# second frame: the same values rotated by 3 positions
spread = spread[3:] + spread[:3]
df_2 = pd.DataFrame(list(zip(hour, spread)), columns=['hour', 'spread'])
df_2.set_index("hour", inplace=True)
# third frame: the rotated values halved
spread = [x / 2 for x in spread]
df_3 = pd.DataFrame(list(zip(hour, spread)), columns=['hour', 'spread'])
df_3.set_index("hour", inplace=True)
Second, concat; note the generic column names in the result:
df_concat = pd.concat([df_1, df_2, df_3],
                      ignore_index=True, axis=1)
df_concat.head(3)
              0         1         2
hour
00    27.461988  0.883571  0.441785
01     2.144416  1.234078  0.617039
02     0.970719  0.747148  0.373574
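For comparison, here is a small sketch of what `ignore_index=True` changes when concatenating along `axis=1`. The two-row frames `a` and `b` are hypothetical and only serve to show the column labels:

```python
import pandas as pd

# Hypothetical two-row frames that share the column name "spread".
idx = pd.Index(["00", "01"], name="hour")
a = pd.DataFrame({"spread": [1.0, 2.0]}, index=idx)
b = pd.DataFrame({"spread": [3.0, 4.0]}, index=idx)

# Without ignore_index the original (duplicate) column names are kept.
kept = pd.concat([a, b], axis=1)
# With ignore_index=True the column labels are replaced by 0, 1, ...
reset = pd.concat([a, b], axis=1, ignore_index=True)

print(list(kept.columns))   # ['spread', 'spread']
print(list(reset.columns))  # [0, 1]
```

Either way the columns end up needing new names; the integer labels just avoid having several columns called `spread` in the meantime.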
Third, (re)name the columns, overwriting any previous names:
df_concat.columns =['spread_1', 'spread_2', 'spread_3']
df_concat.head(3)
       spread_1  spread_2  spread_3
hour
00    27.461988  0.883571  0.441785
01     2.144416  1.234078  0.617039
02     0.970719  0.747148  0.373574
You can also use .rename to get the same effect:
df_concat.rename(columns={df_concat.columns[0]: "spread_1",
                          df_concat.columns[1]: "spread_2",
                          df_concat.columns[2]: "spread_3"}, inplace=True)
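Another option, not part of the answer above, is to name each column before concatenating so that no renaming step is needed afterwards. The following is only a sketch under assumptions: the frames `df_a`, `df_b`, `df_c`, the dict `named`, and the values are hypothetical placeholders for the three tables.

```python
import pandas as pd

# Hypothetical stand-ins for the three tables, all indexed by "hour".
hour = pd.Index(["00", "01", "02"], name="hour")
df_a = pd.DataFrame({"spread": [1.0, 2.0, 3.0]}, index=hour)
df_b = pd.DataFrame({"spread": [4.0, 5.0, 6.0]}, index=hour)
df_c = pd.DataFrame({"spread": [7.0, 8.0, 9.0]}, index=hour)

# Map the desired column name to each frame.
named = {"spread_1": df_a, "spread_2": df_b, "spread_3": df_c}

# Pull out each "spread" Series, rename it, and concatenate side by side.
combined = pd.concat(
    [frame["spread"].rename(name) for name, frame in named.items()],
    axis=1,
)
print(combined)
```

Since `pd.concat` aligns on the index, hours missing from one table would show up as NaN in that column rather than being dropped.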