Python concat columns

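I have 3 similar tables, each showing some spread figures for each of the 24 hours of the day. I want to combine them into 1 table so that I can compare the 3.

So the result should have 4 columns and 25 rows, where the first row and the first column are headers.

Also, how can I change the headers of the 3 columns after merging?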

import pandas as pd

# 24 hourly labels, one per spread value
hour = ['00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '10',
'11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21',
'22', '23']

spread = [27.461988, 2.144416, 0.970719, 0.883571, 1.234078, 0.747148,
0.660058, 1.025625, 0.660939, 0.600193, 0.412775, 0.503613, 0.468141,
0.417250, 0.366429, 0.414767, 0.295326, 0.289255, 0.091598, 0.312621,
0.393910, 0.490924, 0.425078, 1.350392]

# Pair each hour with its spread and use the hour as the index
df = pd.DataFrame(list(zip(hour, spread)), columns=['hour', 'spread'])
df.set_index("hour", inplace=True)

         spread
hour
00    27.461988
01     2.144416
02     0.970719
03     0.883571
04     1.234078
05     0.747148
06     0.660058
07     1.025625
08     0.660939
09     0.600193
10     0.412775
11     0.503613
12     0.468141
13     0.417250
14     0.366429
15     0.414767
16     0.295326
17     0.289255
18     0.091598
19     0.312621
20     0.393910
21     0.490924
22     0.425078
23     1.350392

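Use pandas to do an inner join on hour, where df1, df2 and df3 are your three tables:

# Join the first two tables on hour, then join the result with the third
pd.merge(pd.merge(df1, df2, on=["hour"]),
         df3, on=["hour"])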
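
A minimal sketch of that join, using three tiny hypothetical tables (the values and the names spread_1 to spread_3 are made up for illustration). Because all three frames share the column name spread, merging them as-is would yield the columns spread_x, spread_y and spread; renaming each column first is one way to get predictable headers:

```python
import pandas as pd

# Hypothetical miniature tables standing in for the three real ones
idx = pd.Index(["00", "01", "02"], name="hour")
df1 = pd.DataFrame({"spread": [1.0, 2.0, 3.0]}, index=idx)
df2 = pd.DataFrame({"spread": [4.0, 5.0, 6.0]}, index=idx)
df3 = pd.DataFrame({"spread": [7.0, 8.0, 9.0]}, index=idx)

# Give each table its own column name, then inner-join on the hour index level
merged = (
    df1.rename(columns={"spread": "spread_1"})
    .merge(df2.rename(columns={"spread": "spread_2"}), on="hour")
    .merge(df3.rename(columns={"spread": "spread_3"}), on="hour")
)
print(merged)
```

An inner join keeps only the hours present in all three tables, so an hour missing from any one of them would drop out of the result.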

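You can name the columns after the concat, which overwrites the previous names
(see the .rename variant further down, which has the same effect).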

1st, generate 3 dataframes with the same column name:

import pandas as pd

# 24 hourly labels, one per spread value
hour = ['00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '10',
'11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21',
'22', '23']

spread = [27.461988, 2.144416, 0.970719, 0.883571, 1.234078, 0.747148,
0.660058, 1.025625, 0.660939, 0.600193, 0.412775, 0.503613, 0.468141,
0.417250, 0.366429, 0.414767, 0.295326, 0.289255, 0.091598, 0.312621,
0.393910, 0.490924, 0.425078, 1.350392]

# df_1: the original values
df_1 = pd.DataFrame(list(zip(hour, spread)), columns=['hour', 'spread'])
df_1.set_index("hour", inplace=True)

# df_2: the same values rotated by 3 positions
spread = spread[3:] + spread[:3]
df_2 = pd.DataFrame(list(zip(hour, spread)), columns=['hour', 'spread'])
df_2.set_index("hour", inplace=True)

# df_3: the rotated values halved
spread = [x / 2 for x in spread]
df_3 = pd.DataFrame(list(zip(hour, spread)), columns=['hour', 'spread'])
df_3.set_index("hour", inplace=True)

2nd, concat; note the generic column labels in the result:

df_concat = pd.concat([df_1, df_2, df_3],
                      ignore_index=True, axis=1)
df_concat.head(3)
              0         1         2
hour                               
00    27.461988  0.883571  0.441785
01     2.144416  1.234078  0.617039
02     0.970719  0.747148  0.373574
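
To see where the generic labels 0, 1, 2 come from, it may help to compare the call with a plain concat. A small sketch with two hypothetical frames (values invented for illustration):

```python
import pandas as pd

# Two hypothetical frames that share the column name "spread"
idx = pd.Index(["00", "01"], name="hour")
a = pd.DataFrame({"spread": [1.0, 2.0]}, index=idx)
b = pd.DataFrame({"spread": [3.0, 4.0]}, index=idx)

# Without ignore_index the original (duplicated) column labels are kept
kept = pd.concat([a, b], axis=1)

# With ignore_index=True and axis=1 the column labels are replaced by 0, 1, ...
reset = pd.concat([a, b], axis=1, ignore_index=True)

print(list(kept.columns))
print(list(reset.columns))
```

With axis=1, ignore_index applies to the column labels rather than the row index, so the hour index is left untouched in both cases.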

3rd, (re)name the columns - this overwrites any previous names:

df_concat.columns = ['spread_1', 'spread_2', 'spread_3']
df_concat.head(3)
       spread_1  spread_2  spread_3
hour
00    27.461988  0.883571  0.441785
01     2.144416  1.234078  0.617039
02     0.970719  0.747148  0.373574

You can also use .rename to achieve the same effect:

df_concat.rename(columns={df_concat.columns[0]: "spread_1",
                          df_concat.columns[1]: "spread_2",
                          df_concat.columns[2]: "spread_3"}, inplace = True)
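
If you would rather name the columns before combining, one possible variant is to pull each spread column out as a Series, rename it, and concat the Series. This is only a sketch on hypothetical miniature data; the frame values and the spread_N naming scheme are assumptions for illustration:

```python
import pandas as pd

# Hypothetical miniature versions of df_1, df_2 and df_3
idx = pd.Index(["00", "01", "02"], name="hour")
df_1 = pd.DataFrame({"spread": [1.0, 2.0, 3.0]}, index=idx)
df_2 = pd.DataFrame({"spread": [4.0, 5.0, 6.0]}, index=idx)
df_3 = pd.DataFrame({"spread": [7.0, 8.0, 9.0]}, index=idx)

# Rename each Series first; concat then uses the Series names as column labels
named = [
    frame["spread"].rename(f"spread_{i}")
    for i, frame in enumerate([df_1, df_2, df_3], start=1)
]
df_named = pd.concat(named, axis=1)
print(df_named)
```

This avoids the intermediate 0, 1, 2 labels, at the cost of assuming every table stores its values in a column called spread.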

