Python concat columns

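I have 3 similar tables, each showing some spread figures for each of the 24 hours. I want to combine them into 1 table so that I can compare the 3.

So the result should have 4 columns and 25 rows, where the first row and the first column are headers.

Also, how do I change the headers of each of the 3 columns after merging?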

import pandas as pd

# Hour labels as strings. Note that '05' is absent, so this list has 23 entries.
hour = ['00', '01', '02', '03', '04', '06', '07', '08', '09', '10',
'11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21',
'22', '23']

# Spread values (24 entries).
spread = [27.461988, 2.144416, 0.970719, 0.883571, 1.234078, 0.747148,
0.660058, 1.025625, 0.660939, 0.600193, 0.412775, 0.503613, 0.468141,
0.417250, 0.366429, 0.414767, 0.295326, 0.289255, 0.091598, 0.312621,
0.393910, 0.490924, 0.425078, 1.350392]

# zip() stops at the shorter list, so only the first 23 spread values get an hour.
df = pd.DataFrame(list(zip(hour, spread)),columns = ['hour','spread'])
df.set_index("hour", inplace = True)
         spread
hour
00    27.461988
01     2.144416
02     0.970719
03     0.883571
04     1.234078
06     0.747148
07     0.660058
08     1.025625
09     0.660939
10     0.600193
11     0.412775
12     0.503613
13     0.468141
14     0.417250
15     0.366429
16     0.414767
17     0.295326
18     0.289255
19     0.091598
20     0.312621
21     0.393910
22     0.490924
23     0.425078

Use pandas to do an inner join on hour.

(pd.merge(
         pd.merge(df1, 
             df2, on=["hour"]), 
         df3, on=["hour"])
)
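
Since all three frames use the same column name, the nested merge above ends up with suffixed names such as spread_x and spread_y. A minimal sketch of one way around that, assuming each frame is indexed by "hour" and has a single "spread" column (df1, df2, df3 here are small hypothetical stand-ins, not the asker's full data): rename each column first, then chain the inner joins.

```python
import pandas as pd

# Hypothetical stand-ins for df1, df2, df3: indexed by "hour", one "spread" column each.
idx = pd.Index(["00", "01", "02"], name="hour")
df1 = pd.DataFrame({"spread": [27.461988, 2.144416, 0.970719]}, index=idx)
df2 = pd.DataFrame({"spread": [0.883571, 1.234078, 0.747148]}, index=idx)
df3 = pd.DataFrame({"spread": [0.441785, 0.617039, 0.373574]}, index=idx)

# Give every frame its own column name before merging, so no suffixes are needed.
frames = [
    df.rename(columns={"spread": f"spread_{i}"})
    for i, df in enumerate([df1, df2, df3], start=1)
]

# Chain the inner joins on the shared "hour" index level.
merged = frames[0]
for frame in frames[1:]:
    merged = merged.merge(frame, on="hour", how="inner")

print(merged)
```

Because "hour" is an index level in every frame, it stays the index of the merged result.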

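You can name the columns after the concat, overwriting the previous names
(see the rename further down, which has the same effect).

1st: generate 3 dataframes with the same column name: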

import pandas as pd

# Hour labels as strings ('05' is absent, so there are 23 labels).
hour = ['00', '01', '02', '03', '04', '06', '07', '08', '09', '10',
'11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21',
'22', '23']

spread = [27.461988, 2.144416, 0.970719, 0.883571, 1.234078, 0.747148,
0.660058, 1.025625, 0.660939, 0.600193, 0.412775, 0.503613, 0.468141,
0.417250, 0.366429, 0.414767, 0.295326, 0.289255, 0.091598, 0.312621,
0.393910, 0.490924, 0.425078, 1.350392]

# First frame: the original values.
df_1 = pd.DataFrame(list(zip(hour, spread)),columns = ['hour','spread'])
df_1.set_index("hour", inplace = True)

# Second frame: the same values rotated by 3 positions.
spread = spread[3:] + spread[:3]
df_2 = pd.DataFrame(list(zip(hour, spread)),columns = ['hour','spread'])
df_2.set_index("hour", inplace = True)

# Third frame: the rotated values halved.
spread = [x / 2 for x in spread]
df_3 = pd.DataFrame(list(zip(hour, spread)),columns = ['hour','spread'])
df_3.set_index("hour", inplace = True)

2nd: concat. Note the generic column names in the result:

df_concat = pd.concat([df_1, df_2, df_3],
                      ignore_index=True, axis=1)
df_concat.head(3)
              0         1         2
hour                               
00    27.461988  0.883571  0.441785
01     2.144416  1.234078  0.617039
02     0.970719  0.747148  0.373574
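
The generic 0, 1, 2 labels come from ignore_index=True, which along axis=1 discards the existing column labels. A small sketch of the difference, using two made-up frames (a and b are hypothetical, only there to keep the example short):

```python
import pandas as pd

# Two tiny hypothetical frames that share both the index and the column name.
idx = pd.Index(["00", "01"], name="hour")
a = pd.DataFrame({"spread": [1.0, 2.0]}, index=idx)
b = pd.DataFrame({"spread": [3.0, 4.0]}, index=idx)

# Without ignore_index, the original (duplicate) column names are kept.
kept = pd.concat([a, b], axis=1)

# With ignore_index=True, the column labels are replaced by 0, 1, ...
reset = pd.concat([a, b], axis=1, ignore_index=True)

print(list(kept.columns))   # ['spread', 'spread']
print(list(reset.columns))  # [0, 1]
```

Either way the columns need new names afterwards, since duplicate labels make selecting a single column by name ambiguous.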

3rd: (re)name the columns, overwriting any previous names:

df_concat.columns =['spread_1', 'spread_2', 'spread_3']
df_concat.head(3)
       spread_1  spread_2  spread_3
hour
00    27.461988  0.883571  0.441785
01     2.144416  1.234078  0.617039
02     0.970719  0.747148  0.373574

You can also use .rename to achieve the same effect:

df_concat.rename(columns={df_concat.columns[0]: "spread_1",
                          df_concat.columns[1]: "spread_2",
                          df_concat.columns[2]: "spread_3"}, inplace = True)
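
As a possible shortcut, the naming can also happen during the concat itself. This is only a sketch under the assumption that each frame contributes a single "spread" column: passing the columns as Series together with keys= makes the keys the column names. The three frames below are shortened, hypothetical versions of df_1, df_2 and df_3.

```python
import pandas as pd

# Shortened, hypothetical versions of df_1, df_2, df_3 (first three hours only).
idx = pd.Index(["00", "01", "02"], name="hour")
df_1 = pd.DataFrame({"spread": [27.461988, 2.144416, 0.970719]}, index=idx)
df_2 = pd.DataFrame({"spread": [0.883571, 1.234078, 0.747148]}, index=idx)
df_3 = pd.DataFrame({"spread": [0.441785, 0.617039, 0.373574]}, index=idx)

# Concatenating Series along axis=1 with keys= uses the keys as column names.
df_named = pd.concat(
    [df_1["spread"], df_2["spread"], df_3["spread"]],
    axis=1,
    keys=["spread_1", "spread_2", "spread_3"],
)

print(df_named)
```

Note that passing whole DataFrames with keys= would instead produce two-level column labels, which is why the sketch selects the "spread" Series first.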

