How to slice a dataframe based on the increment of a value in a column?
import pandas as pd

frame = pd.Series([4051, 4052, 4053, 4054, 4060, 4061])
heat = pd.Series([52, 51, 50, 52, 53, 50])
df_1 = pd.DataFrame()
df_1['Frame'] = frame
df_1['Heat'] = heat
I have a dataframe df_1. I want to retrieve a dataframe df_2 which contains only the rows of df_1 for which the increment of Frame from one row to the next is smaller than or equal to 3. If the increment is larger, the search shall stop. I tried this:
i = 0
df_2 = pd.DataFrame()
for i in df_1['Frame']:
    j = i+1
    if (df_1['Frame'][j] - df_1['Frame'][i]) > 3:
        break
    else:
        df_2.append(i)
It results in an error. Can you find my mistake? If possible, I would prefer a solution without a loop, since loops tend to be slow.
My desired output would be:
frame = pd.Series([4051, 4052, 4053, 4054])
heat = pd.Series([52, 51, 50, 52])
df_1 = pd.DataFrame()
df_1['Frame'] = frame
df_1['Heat'] = heat
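On the error itself: the loop iterates over the values of Frame (4051, 4052, ...) and then uses those values as index labels, so df_1['Frame'][4052] raises a KeyError because the index only runs from 0 to 5. Separately, DataFrame.append returned a new frame instead of modifying df_2 in place, and it was removed in pandas 2.0. Below is a minimal sketch of a loop that works by position instead; the names frames, keep and pos are illustrative and not from the original post. It assumes the row just before the large jump should be kept, as in the desired output.

```python
import pandas as pd

df_1 = pd.DataFrame({
    'Frame': [4051, 4052, 4053, 4054, 4060, 4061],
    'Heat': [52, 51, 50, 52, 53, 50],
})

frames = df_1['Frame'].tolist()

# Number of leading rows to keep; the first row has no predecessor.
keep = 1 if frames else 0

# Walk by position and compare each row with the previous one.
for pos in range(1, len(frames)):
    if frames[pos] - frames[pos - 1] > 3:
        break  # first jump larger than 3: stop the search
    keep = pos + 1

# Slice once at the end instead of appending row by row.
df_2 = df_1.iloc[:keep]
print(df_2)
```

Slicing once with iloc avoids growing a dataframe inside the loop, which is the slow part of the original approach.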
Use Series.diff to get the row-to-row increment, compare it for greater than 3 with Series.gt, and propagate the first True to all following rows with Series.cummax. Then invert the mask with ~ (bitwise NOT) and filter with boolean indexing:
df_1 = df_1[~df_1['Frame'].diff().gt(3).cummax()]
print(df_1)
   Frame  Heat
0   4051    52
1   4052    51
2   4053    50
3   4054    52
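To see why the one-liner stops at the first large jump, it can help to split the chain into its intermediate steps. This is only a sketch of the same technique; the variable names step, too_big and stop are illustrative and not part of the original answer.

```python
import pandas as pd

df_1 = pd.DataFrame({
    'Frame': [4051, 4052, 4053, 4054, 4060, 4061],
    'Heat': [52, 51, 50, 52, 53, 50],
})

# Increment from the previous row; the first row has no predecessor, so it is NaN.
step = df_1['Frame'].diff()

# True where the increment exceeds 3; NaN compares as False.
too_big = step.gt(3)

# Once a True appears, cummax keeps every later row True as well.
stop = too_big.cummax()

# Keep only the rows before the first large increment.
df_2 = df_1[~stop]
print(df_2)
```

Without cummax, the row with Frame 4061 would be kept again, because its own increment (1) is small; the cumulative maximum is what makes the filter stop at the first jump.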