Reorganize pandas 'timed' dataframe into single row to allow for concat
I have dataframes (stored in Excel files), one per participant, each of which looks like this:
import pandas as pd

df1 = pd.DataFrame([
    ['15:05', '15:06', '15:07', '15:08'],
    [7.333879016553067, 8.066897471204006, 7.070168678977272, 6.501888904228463],
    [64.16712081101915, 65.08486717007806, 67.22483766233766, 64.40328265521458],
    [114.21879259980525, 116.49792952572476, 113.26931818181818, 108.35424424108551],
]).T
df1.columns = ['Start', 'CO', 'Dia', 'Sys']
| | Start | CO | Dia | Sys |
---|---|---|---|---|
| 0 | 15:05 | 7.33388 | 64.1671 | 114.219 |
| 1 | 15:06 | 8.0669 | 65.0849 | 116.498 |
| 2 | 15:07 | 7.07017 | 67.2248 | 113.269 |
| 3 | 15:08 | 6.50189 | 64.4033 | 108.354 |
and I need to unstack it into one row so that I can then read all the different participants into a single dataframe. I have tried using the answer to this question, and the answer to this question, to get something like this (a multiindexed dataframe):
| | Time 1 | | | Time 2 | | |
---|---|---|---|---|---|---|
| | CO | Dia | Sys | CO | Dia | Sys |
| 0 | 7.33388 | 64.1671 | 114.219 | 8.0669 | 65.0849 | 116.498 |
But what I'm ending up with is:
| | ('15:05', 'CO') | ('15:05', 'Dia') | ('15:05', 'Sys') | ('15:06', 'CO') | ('15:06', 'Dia') | ('15:06', 'Sys') |
---|---|---|---|---|---|---|
| 0 | 7.33388 | 64.1671 | 114.219 | nan | nan | nan |
| 1 | nan | nan | nan | 8.0669 | 65.0849 | 116.498 |
So as you can see, each minute is still a new row, but now the values are arranged in an even less useful way.
Can anyone offer advice?
Assuming that each row is Time 0, Time 1, etc., we can use the index as the top level of the MultiIndex:
# convert index to string and add "Time "
df1.index = "Time " + df1.index.astype(str)
Then group by the index, take the max (or any other aggregate that keeps the original values, since each group holds a single row) of every column except "Start" (the 0th column), stack, convert back to a frame, and transpose:
out = df1.groupby(df1.index)[df1.columns[1:]].max().stack().to_frame().T
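Since the stated goal is to concatenate several participants, here is a minimal end-to-end sketch of that step. The helper name `flatten_participant`, the labels `P1` and `P2`, and the second participant's values are all made up for illustration; only the first participant's (rounded) numbers come from the question. The sketch also skips the `groupby` and stacks directly, which should give the same single row. One possible reason to prefer this: `groupby` sorts its keys by default, so with more than ten rows a label like "Time 10" would be ordered before "Time 2".

```python
import pandas as pd


def flatten_participant(df):
    """Collapse one participant's per-minute frame into a single row.

    Hypothetical helper: assumes a 'Start' column plus numeric measure columns.
    """
    # Drop the clock time and relabel the rows "Time 0", "Time 1", ...
    values = df.drop(columns="Start").reset_index(drop=True)
    values.index = "Time " + values.index.astype(str)
    # stack() yields a Series keyed by (time label, measure); transposing the
    # resulting one-column frame gives one row with MultiIndex columns.
    return values.stack().to_frame().T


# First participant: rounded values from the question (first two minutes only).
p1 = pd.DataFrame({
    "Start": ["15:05", "15:06"],
    "CO": [7.33, 8.07],
    "Dia": [64.17, 65.08],
    "Sys": [114.22, 116.50],
})
# Second participant: made-up values, purely for illustration.
p2 = pd.DataFrame({
    "Start": ["09:30", "09:31"],
    "CO": [6.90, 7.10],
    "Dia": [70.00, 71.50],
    "Sys": [120.00, 118.00],
})

# One row per participant, with (time, measure) MultiIndex columns.
combined = pd.concat(
    [flatten_participant(p1), flatten_participant(p2)],
    ignore_index=True,
)
combined.index = ["P1", "P2"]  # hypothetical participant labels
print(combined)
```

Because the columns are keyed by position ("Time 0", "Time 1", ...) rather than by clock time, participants recorded at different times of day line up in the same columns.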