How do I reformat several columns of a DataFrame into one row?
This is a snippet of the DataFrame I am working with:
type date time open close change high low 200ema 50ema
0 sixty-min 2007-06-04 09:00:00 1536.28 1534.71 -0.102 0.000 -0.259 NaN 1522.90
1 sixty-min 2007-06-04 10:00:00 1534.87 1534.79 -0.005 0.109 -0.106 NaN 1523.37
2 sixty-min 2007-06-04 11:00:00 1534.88 1536.08 0.078 0.124 -0.023 NaN 1523.87
3 sixty-min 2007-06-04 12:00:00 1536.21 1537.30 0.071 0.118 -0.036 NaN 1524.39
4 sixty-min 2007-06-04 13:00:00 1537.31 1536.23 -0.070 0.011 -0.130 NaN 1524.86
5 sixty-min 2007-06-04 14:00:00 1536.25 1536.91 0.043 0.096 -0.078 NaN 1525.33
6 sixty-min 2007-06-04 15:00:00 1536.53 1539.10 0.167 0.260 0.000 NaN 1525.87
7 sixty-min 2007-06-04 16:00:00 1539.00 1539.18 0.012 0.012 0.000 NaN 1526.39
8 sixty-min 2007-06-05 09:00:00 1539.12 1533.15 -0.389 0.000 -0.456 NaN 1526.66
9 sixty-min 2007-06-05 10:00:00 1533.16 1534.77 0.105 0.160 -0.178 NaN 1526.97
What I want to do is compile this DataFrame into a DataFrame with only one row. It would have the following columns:
['date', '60 9 open', '60 9 close', '60 9 change', '60 9 high', '60 9 low', '60 9 200ema', '60 9 50ema',
'60 10 open', '60 10 close', '60 10 change', '60 10 high', '60 10 low', '60 10 200ema', '60 10 50ema',
'60 11 open', '60 11 close', '60 11 change', '60 11 high', '60 11 low', '60 11 200ema', '60 11 50ema',
'60 12 open', '60 12 close', '60 12 change', '60 12 high', '60 12 low', '60 12 200ema', '60 12 50ema',
'60 13 open', '60 13 close', '60 13 change', '60 13 high', '60 13 low', '60 13 200ema', '60 13 50ema',
'60 14 open', '60 14 close', '60 14 change', '60 14 high', '60 14 low', '60 14 200ema', '60 14 50ema',
'60 15 open', '60 15 close', '60 15 change', '60 15 high', '60 15 low', '60 15 200ema', '60 15 50ema',
'60 16 open', '60 16 close', '60 16 change', '60 16 high', '60 16 low', '60 16 200ema', '60 16 50ema',]
The difference is that the row has only one date and no type, and each column header is built from the data type and time of the cell it holds.
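You can first extract the hour as an int and group by date:
import pandas as pd

# Keep only the hour (9, 10, ...) of each time value.
df['time'] = pd.to_datetime(df['time']).dt.hour
# One row per date; each remaining column becomes a list of that day's values.
df = df.groupby('date').agg(list)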
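To make the intermediate result easier to picture, here is a minimal sketch of that grouping step on a cut-down frame. The three rows and two value columns are copied from the snippet in the question; the explicit format="%H:%M:%S" is an assumption that the time column holds strings in that form.

```python
import pandas as pd

# Cut-down sample with the same layout as the question's frame
# (only 'open' and 'close' are kept to stay short).
df = pd.DataFrame({
    "type": ["sixty-min"] * 3,
    "date": ["2007-06-04", "2007-06-04", "2007-06-05"],
    "time": ["09:00:00", "10:00:00", "09:00:00"],
    "open": [1536.28, 1534.87, 1539.12],
    "close": [1534.71, 1534.79, 1533.15],
})

# Replace each time string by its hour as an integer.
df["time"] = pd.to_datetime(df["time"], format="%H:%M:%S").dt.hour

# One row per date; every other column now holds a list of that day's values.
grouped = df.groupby("date").agg(list)
print(grouped)
```

Each cell of grouped is a plain Python list, and the 'time' list of a row gives the hours that the other lists in the same row line up with.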
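Then, for each date, concatenate (along columns, axis 1) the DataFrames created from each column. Finally, concatenate (along rows, axis 0) the DataFrames of all dates:
# 'type' and 'time' are not values, so they get no output columns of their own.
value_cols = df.columns.difference(['type', 'time'], sort=False)

df_out = pd.concat([
    pd.concat([pd.DataFrame([row[col]], index=[index],
                            columns=[f'60 {h} {col}' for h in row['time']])
               for col in value_cols], axis=1)
    for index, row in df.iterrows()
])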
Output:
60 9 open 60 10 open 60 11 open 60 12 open 60 13 open 60 14 open 60 15 open ... 60 10 50ema 60 11 50ema 60 12 50ema 60 13 50ema 60 14 50ema 60 15 50ema 60 16 50ema
2007-06-04 1536.28 1534.87 1534.88 1536.21 1537.31 1536.25 1536.53 ... 1523.37 1523.87 1524.39 1524.86 1525.33 1525.87 1526.39
2007-06-05 1539.12 1533.16 NaN NaN NaN NaN NaN ... 1526.97 NaN NaN NaN NaN NaN NaN
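As an alternative that is not part of the answer above, the same wide layout can probably be reached with DataFrame.pivot, which avoids the nested concat calls. This is only a sketch on a cut-down sample: the helper column name 'hour' and the format="%H:%M:%S" argument are assumptions, and it relies on each date/hour pair occurring at most once (pivot raises an error on duplicates).

```python
import pandas as pd

# Cut-down sample with the same layout as the question's frame.
df = pd.DataFrame({
    "type": ["sixty-min"] * 3,
    "date": ["2007-06-04", "2007-06-04", "2007-06-05"],
    "time": ["09:00:00", "10:00:00", "09:00:00"],
    "open": [1536.28, 1534.87, 1539.12],
    "close": [1534.71, 1534.79, 1533.15],
})

# 'hour' is a helper column introduced for this sketch.
df["hour"] = pd.to_datetime(df["time"], format="%H:%M:%S").dt.hour

# Spread the hours across columns; every remaining column is used as values,
# so the result has MultiIndex columns of the form (value name, hour).
wide = df.drop(columns=["type", "time"]).pivot(index="date", columns="hour")

# Flatten the MultiIndex into the '60 <hour> <name>' headers from the question.
wide.columns = [f"60 {hour} {name}" for name, hour in wide.columns]
print(wide)
```

Hours that are missing for a date show up as NaN, just as in the output above.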