How do I reformat several columns of a DataFrame into one row?

Here is a snippet of the DataFrame I am working with:

        type        date      time     open    close  change   high    low  200ema    50ema
0  sixty-min  2007-06-04  09:00:00  1536.28  1534.71  -0.102  0.000 -0.259     NaN  1522.90
1  sixty-min  2007-06-04  10:00:00  1534.87  1534.79  -0.005  0.109 -0.106     NaN  1523.37
2  sixty-min  2007-06-04  11:00:00  1534.88  1536.08   0.078  0.124 -0.023     NaN  1523.87
3  sixty-min  2007-06-04  12:00:00  1536.21  1537.30   0.071  0.118 -0.036     NaN  1524.39
4  sixty-min  2007-06-04  13:00:00  1537.31  1536.23  -0.070  0.011 -0.130     NaN  1524.86
5  sixty-min  2007-06-04  14:00:00  1536.25  1536.91   0.043  0.096 -0.078     NaN  1525.33
6  sixty-min  2007-06-04  15:00:00  1536.53  1539.10   0.167  0.260  0.000     NaN  1525.87
7  sixty-min  2007-06-04  16:00:00  1539.00  1539.18   0.012  0.012  0.000     NaN  1526.39
8  sixty-min  2007-06-05  09:00:00  1539.12  1533.15  -0.389  0.000 -0.456     NaN  1526.66
9  sixty-min  2007-06-05  10:00:00  1533.16  1534.77   0.105  0.160 -0.178     NaN  1526.97

What I want to do is compile this DataFrame into a DataFrame with only one row. It would have the following columns:

['date',        '60 9 open',    '60 9 close',   '60 9 change',  '60 9 high',    '60 9 low',     '60 9 200ema',  '60 9 50ema',
'60 10 open',   '60 10 close',  '60 10 change', '60 10 high',   '60 10 low',    '60 10 200ema', '60 10 50ema',
'60 11 open',   '60 11 close',  '60 11 change', '60 11 high',   '60 11 low',    '60 11 200ema', '60 11 50ema',
'60 12 open',   '60 12 close',  '60 12 change', '60 12 high',   '60 12 low',    '60 12 200ema', '60 12 50ema',
'60 13 open',   '60 13 close',  '60 13 change', '60 13 high',   '60 13 low',    '60 13 200ema', '60 13 50ema',
'60 14 open',   '60 14 close',  '60 14 change', '60 14 high',   '60 14 low',    '60 14 200ema', '60 14 50ema',
'60 15 open',   '60 15 close',  '60 15 change', '60 15 high',   '60 15 low',    '60 15 200ema', '60 15 50ema',
'60 16 open',   '60 16 close',  '60 16 change', '60 16 high',   '60 16 low',    '60 16 200ema', '60 16 50ema',]

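The difference is that the row carries only a single date and no type, and each column header is built from the bar type and the time of the cell it holds.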

You can first extract the hour as an int and group by date:

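import pandas as pd

df['time'] = pd.to_datetime(df['time']).dt.hour  # '09:00:00' -> 9
df = df.drop(columns='type').groupby('date').agg(list)  # one list per column per date; 'type' is not needed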
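
To see what this grouping step produces, here is a minimal sketch on a made-up three-row frame. The name `sample`, the reduced set of columns, and the explicit `format='%H:%M:%S'` are assumptions for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical three-row sample laid out like the question's data
sample = pd.DataFrame({
    'date': ['2007-06-04', '2007-06-04', '2007-06-05'],
    'time': ['09:00:00', '10:00:00', '09:00:00'],
    'open': [1536.28, 1534.87, 1539.12],
    'close': [1534.71, 1534.79, 1533.15],
})

# Parse the time strings and keep only the integer hour ('09:00:00' -> 9)
sample['time'] = pd.to_datetime(sample['time'], format='%H:%M:%S').dt.hour

# After grouping, each cell holds the list of that date's values in row order
grouped = sample.groupby('date').agg(list)
print(grouped)
```

Each date ends up as one row whose cells are lists of equal length, so the hours in `time` line up position by position with the values in every other column.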

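Then, for each date, concatenate (along columns, axis=1) the DataFrames built from each column. Finally, concatenate (along rows, axis=0) the DataFrames for all dates: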

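df_out = pd.concat([
    # Inner concat: one single-row frame per date, built column by column
    pd.concat([
        pd.DataFrame([row[col]], index=[index],
                     columns=[f'60 {h} {col}' for h in row['time']])
        # Skip the helper columns so that only the value columns are spread out
        for col in row.index.drop(['type', 'time'], errors='ignore')
    ], axis=1)
    for index, row in df.iterrows()
])  # Outer concat: stack the per-date rows (axis=0)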

Output:

            60 9 open  60 10 open  60 11 open  60 12 open  60 13 open  60 14 open  60 15 open  ...  60 10 50ema  60 11 50ema  60 12 50ema  60 13 50ema  60 14 50ema  60 15 50ema  60 16 50ema
2007-06-04    1536.28     1534.87     1534.88     1536.21     1537.31     1536.25     1536.53  ...      1523.37      1523.87      1524.39      1524.86      1525.33      1525.87      1526.39
2007-06-05    1539.12     1533.16         NaN         NaN         NaN         NaN         NaN  ...      1526.97          NaN          NaN          NaN          NaN          NaN          NaN
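
As an alternative sketch, not taken from the answer above, the same wide layout can probably be reached with `DataFrame.pivot` followed by flattening the resulting column MultiIndex. The frame `sample`, the helper column `hour`, and the choice of only two value columns are assumptions made to keep the example small.

```python
import pandas as pd

# Hypothetical sample laid out like the question's data
sample = pd.DataFrame({
    'type': ['sixty-min'] * 3,
    'date': ['2007-06-04', '2007-06-04', '2007-06-05'],
    'time': ['09:00:00', '10:00:00', '09:00:00'],
    'open': [1536.28, 1534.87, 1539.12],
    'close': [1534.71, 1534.79, 1533.15],
})

# Integer hour used to label the output columns
sample['hour'] = pd.to_datetime(sample['time'], format='%H:%M:%S').dt.hour

# One row per date; columns become (field, hour) pairs
wide = sample.pivot(index='date', columns='hour', values=['open', 'close'])

# Flatten the (field, hour) MultiIndex into '60 <hour> <field>' labels
wide.columns = [f'60 {hour} {field}' for field, hour in wide.columns]
print(wide)
```

Hours that are missing for a date come out as NaN, as in the output above. Note that `pivot` raises an error if the same date and hour pair occurs more than once, so this sketch assumes one bar per hour per date.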

