How to slice dataframe based on increment of value in a column?

import pandas as pd

frame = pd.Series([4051, 4052, 4053, 4054, 4060, 4061])
heat = pd.Series([52, 51, 50, 52, 53, 50])
df_1 = pd.DataFrame()
df_1['Frame'] = frame
df_1['Heat'] = heat

I have a dataframe df_1. I want to retrieve a dataframe df_2 that contains only the leading rows of df_1 for which the increment of Frame from one row to the next is smaller than or equal to 3. As soon as the increment is larger, the search should stop. I tried this:

i = 0
df_2 = pd.DataFrame()

for i in df_1['Frame']:
    j = i+1
    if (df_1['Frame'][j] - df_1['Frame'][i]) > 3:
        break
    else: 
        df_2.append(i)

It results in an error. Can you find my mistake? If possible, I would prefer a solution without a loop, since loops tend to be slow.

My desired output would be:

frame = pd.Series([4051, 4052, 4053, 4054])
heat = pd.Series([52, 51, 50, 52])
df_2 = pd.DataFrame()
df_2['Frame'] = frame
df_2['Heat'] = heat

Use Series.diff to get the increment from one row to the next, compare it with Series.gt to flag increments greater than 3, and apply Series.cummax so that the mask stays True from the first large increment onward. Then invert the mask with ~ (bitwise NOT) and use it for filtering with boolean indexing:

df_1 = df_1[~df_1['Frame'].diff().gt(3).cummax()]
print (df_1)
   Frame  Heat
0   4051    52
1   4052    51
2   4053    50
3   4054    52
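
To see why each step of the one-liner is needed, the chain can be split into intermediate Series. This is only an illustrative sketch on the sample data from the question; the names step, too_big and stop are made up here and are not part of the original answer.

```python
import pandas as pd

df_1 = pd.DataFrame({
    'Frame': [4051, 4052, 4053, 4054, 4060, 4061],
    'Heat': [52, 51, 50, 52, 53, 50],
})

# Increment of Frame relative to the previous row (the first row has no predecessor, so NaN)
step = df_1['Frame'].diff()

# True only where the increment exceeds 3; NaN compares as False
too_big = step.gt(3)

# Running maximum: once a True appears, every later row is True as well
stop = too_big.cummax()

# Keep only the rows before the first large increment
df_2 = df_1[~stop]
print(df_2)
```

Without cummax, the last row (4061, increment 1 from 4060) would pass the filter again, which is not what "the search shall stop" asks for.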

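As for the mistake in the loop: for i in df_1['Frame'] iterates over the values (4051, 4052, ...), not over row positions, so df_1['Frame'][j] looks up a label such as 4052 that does not exist in the index and raises a KeyError. In addition, DataFrame.append never modified the frame in place, and it was removed in pandas 2.0. If a loop is wanted anyway, a minimal sketch could look like the following; the names frames, pos and stop are chosen here for illustration.

```python
import pandas as pd

df_1 = pd.DataFrame({
    'Frame': [4051, 4052, 4053, 4054, 4060, 4061],
    'Heat': [52, 51, 50, 52, 53, 50],
})

frames = df_1['Frame'].to_numpy()

# Default: keep every row if no increment larger than 3 is found
stop = len(frames)

# Walk over row positions and compare each value with its predecessor
for pos in range(1, len(frames)):
    if frames[pos] - frames[pos - 1] > 3:
        stop = pos  # first row that must be excluded
        break

# Slice once by position instead of appending row by row
df_2 = df_1.iloc[:stop]
print(df_2)
```

This gives the same rows as the vectorized version, which is still the better choice for large frames.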