How do I get the max of a dataframe slice based on column values?

Question

I'm looking to make a new column, MaxPriceBetweenEntries, based on the max() of a slice of the dataframe:

idx Price EntryBar ExitBar
0   10.00 0        1
1   11.00 NaN      NaN
2   10.15 2        4
3   12.14 NaN      NaN
4   10.30 NaN      NaN

turned into

idx Price EntryBar ExitBar MaxPriceBetweenEntries
0   10.00 0        1       11.00
1   11.00 NaN      NaN     NaN
2   10.15 2        4       12.14
3   12.14 NaN      NaN     NaN
4   10.30 NaN      NaN     NaN

I can get all the rows with an EntryBar or ExitBar value with df.loc[df["EntryBar"].notnull()] and df.loc[df["ExitBar"].notnull()], but I can't use that to set a new column:

df.loc[df["EntryBar"].notnull(),"MaxPriceBetweenEntries"] = df.loc[df["EntryBar"]:df["ExitBar"]]["Price"].max()

but that's effectively a guess at this point, because nothing I'm trying works. Ideally the solution wouldn't involve a loop directly because there may be millions of rows.
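For anyone reproducing this, here is a minimal sketch that rebuilds the sample frame and computes the expected column with a plain loop. It assumes EntryBar and ExitBar are inclusive row positions on a default RangeIndex (the idx column is left out), and the name `expected` is made up for illustration. The loop is only a small-data reference for checking a vectorized answer, not something to run on millions of rows.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the default RangeIndex stands in for idx.
df = pd.DataFrame({
    "Price": [10.00, 11.00, 10.15, 12.14, 10.30],
    "EntryBar": [0, np.nan, 2, np.nan, np.nan],
    "ExitBar": [1, np.nan, 4, np.nan, np.nan],
})

# Reference result: for each row that has both bars, take the max Price
# over the positions EntryBar..ExitBar (inclusive). All other rows stay NaN.
expected = pd.Series(np.nan, index=df.index, name="MaxPriceBetweenEntries")
for row in df.dropna(subset=["EntryBar", "ExitBar"]).itertuples():
    start, stop = int(row.EntryBar), int(row.ExitBar)
    expected.loc[row.Index] = df["Price"].iloc[start:stop + 1].max()

print(expected)
```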

Answer 1

You can group by the cumulative sum of non-null entries and take the max, using np.where() so the result is only applied to rows with a non-null EntryBar:

df['MaxPriceBetweenEntries'] = np.where(df['EntryBar'].notnull(),
                                        df.groupby(df['EntryBar'].notnull().cumsum())['Price'].transform('max'),
                                        np.nan)
df
Out[1]: 
   idx  Price  EntryBar  ExitBar  MaxPriceBetweenEntries
0    0  10.00       0.0      1.0                   11.00
1    1  11.00       NaN      NaN                     NaN
2    2  10.15       2.0      4.0                   12.14
3    3  12.14       NaN      NaN                     NaN
4    4  10.30       NaN      NaN                     NaN
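To see why the cumulative sum works as a grouping key, it may help to look at the intermediate values. This is a sketch on the question's sample data; the names `is_entry`, `group_id`, `group_max` and `steps` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Price": [10.00, 11.00, 10.15, 12.14, 10.30],
    "EntryBar": [0, np.nan, 2, np.nan, np.nan],
    "ExitBar": [1, np.nan, 4, np.nan, np.nan],
})

# True on rows where a new entry starts.
is_entry = df["EntryBar"].notnull()

# Each True bumps the running count, so every row from an entry up to
# (but not including) the next entry shares one id: 1, 1, 2, 2, 2.
group_id = is_entry.cumsum()

# transform('max') broadcasts each group's max back onto all of its rows.
group_max = df.groupby(group_id)["Price"].transform("max")

steps = pd.DataFrame({
    "is_entry": is_entry,
    "group_id": group_id,
    "group_max": group_max,
})
print(steps)
```

np.where() then keeps group_max only where is_entry is True and writes NaN everywhere else.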

Answer 2

Let's try groupby() and where:

s = df['EntryBar'].notna()
df['MaxPriceBetweenEntries'] = df.groupby(s.cumsum())['Price'].transform('max').where(s)

Output:

   idx  Price  EntryBar  ExitBar  MaxPriceBetweenEntries
0    0  10.00       0.0      1.0                   11.00
1    1  11.00       NaN      NaN                     NaN
2    2  10.15       2.0      4.0                   12.14
3    3  12.14       NaN      NaN                     NaN
4    4  10.30       NaN      NaN                     NaN
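One thing to watch: grouping on the entry count makes each group run until the row before the next entry, and ExitBar is never read. That matches the sample only because every exit sits right before the next entry. If an exit can come earlier, a sketch along these lines may be closer to what the question asks. It assumes ExitBar is an inclusive row position on a default RangeIndex; the helper `max_between_bars` and the modified sample with a 50.00 price after the first exit are hypothetical.

```python
import numpy as np
import pandas as pd

def max_between_bars(df):
    """Max Price from each entry row through its ExitBar (inclusive)."""
    is_entry = df["EntryBar"].notna()
    group_id = is_entry.cumsum()
    # Broadcast each group's ExitBar (its first non-null value) to all rows.
    exit_bar = df["ExitBar"].groupby(group_id).transform("first")
    # Row positions, so this does not depend on the index labels.
    position = pd.Series(np.arange(len(df)), index=df.index)
    # Hide prices that fall after the exit before taking the group max.
    in_trade = position <= exit_bar
    group_max = df["Price"].where(in_trade).groupby(group_id).transform("max")
    return group_max.where(is_entry)

# Hypothetical data: row 2 lies after the first exit and before the next entry.
sample = pd.DataFrame({
    "Price": [10.00, 11.00, 50.00, 10.15, 12.14],
    "EntryBar": [0, np.nan, np.nan, 3, np.nan],
    "ExitBar": [1, np.nan, np.nan, 4, np.nan],
})
sample["MaxPriceBetweenEntries"] = max_between_bars(sample)
print(sample)
```

On this modified sample the plain cumsum grouping would report 50.00 for the first entry, while the masked version reports 11.00.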

Answer 3

You can forward fill the null values, group by EntryBar, and take the max of each group's Price. Use that as the right side of a left join and you should be in business.

df.merge(
    df.ffill()
      .groupby('EntryBar')['Price']
      .max()
      .reset_index(name='MaxPriceBetweenEntries'),
    on='EntryBar',
    how='left',
)
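Note that merge returns a new frame, so the result has to be assigned back to be kept. As a possible alternative (a sketch, not the answerer's method), the same lookup can be done with Series.map, forward filling only the EntryBar column rather than the whole frame. The name `max_by_entry` is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Price": [10.00, 11.00, 10.15, 12.14, 10.30],
    "EntryBar": [0, np.nan, 2, np.nan, np.nan],
    "ExitBar": [1, np.nan, 4, np.nan, np.nan],
})

# Max Price per forward-filled EntryBar value; the result is indexed by EntryBar.
max_by_entry = df["Price"].groupby(df["EntryBar"].ffill()).max()

# Look up each row's own (unfilled) EntryBar; NaN keys have no match and stay NaN.
df["MaxPriceBetweenEntries"] = df["EntryBar"].map(max_by_entry)
print(df)
```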

Answer 4

Try

df.loc[df['ExitBar'].notna(),'Max']=df.groupby(df['ExitBar'].ffill()).Price.max().values
df
Out[74]: 
   idx  Price  EntryBar  ExitBar    Max
0    0  10.00       0.0      1.0  11.00
1    1  11.00       NaN      NaN    NaN
2    2  10.15       2.0      4.0  12.14
3    3  12.14       NaN      NaN    NaN
4    4  10.30       NaN      NaN    NaN

Answer 1
2 2020-10-16 00:21:16

Answer 2
2 2020-10-16 00:24:25

Answer 3
1 accepted 2020-10-16 00:17:57

Answer 4
1 2020-10-16 00:27:29
