Filter dataframe in python with multiple equations

I have a dataframe df that looks like this:

    Lvl    Distance    iMap     Grp
0   37     63          A3       1
1   37     59          A9       1
2   37     54          A3       2
3   37     48          A4       2
...
190 37     27          A3       1
191 37     20          A3       4

I have 2 filters that I am trying to combine with "OR".

The first is m1:

m1 = df[(df["Distance"]<55)].groupby('Grp').cumcount().eq(0)

Notice that the index of m1 starts at 2 (not 0):

>>> m1
2      True
3      False
4      False
       ...
187    True
188    False
189    False
190    False
191    False

The second filter is m2:

m2 = df['Distance'].gt(55)

Notice that the index of m2 starts at 0:

>>> m2
0      True
1      True
2      False
3      False
4      False
       ...
187    False
188    False
189    False
190    False
191    False

When I try to combine both filters:

df[m1 | m2]

the result is:

    Lvl    Distance    iMap     Grp
2   37     54          A3       2
19  37     41          A4       3
74  37     36          A3       1
187 37     29          A3       4

You can see that the first 2 records were not selected, although their value is True in m2. Those index labels do not exist in m1.

Any idea how to fix this, so that a row is shown whenever either filter is True for its index?
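The mismatch described above can be checked directly. The following is a minimal sketch on a small made-up sample (six rows with only the Distance and Grp columns, not the full data from the question); the names same_index and missing_labels are illustrative only.

```python
import pandas as pd

# Hypothetical six-row sample, reduced to the two columns the filters use.
df = pd.DataFrame({
    "Distance": [63, 59, 54, 48, 27, 20],
    "Grp": [1, 1, 2, 2, 1, 4],
})

# m1 is built from a filtered frame, so it only carries the surviving labels.
m1 = df[df["Distance"] < 55].groupby("Grp").cumcount().eq(0)
# m2 is built from the full frame, so it carries every label of df.
m2 = df["Distance"].gt(55)

same_index = m1.index.equals(m2.index)
missing_labels = df.index.difference(m1.index).tolist()

print(same_index)      # False
print(missing_labels)  # [0, 1]
```

The labels listed in missing_labels are exactly the rows that m1 has no value for, which is why combining the two masks needs an alignment step first.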

IMO, the m1 construction is incomplete. You need to create a mask that has the same length as df. One way to do that is to use the isin() method to check which labels in df.index belong to the rows flagged True in m1. That way, the True values in m1 are mapped back onto df.index.

# first create groups as before
m1 = df[df['Distance']<55].groupby('Grp').cumcount().eq(0)
# filter for the index of the values flagged True in `m1` and 
# flag the rows of these indexes as True
m1 = df.index.isin(m1.index[m1])

# m2 construction as before
m2 = df['Distance'].gt(55)

# filter
df[m1|m2]

With this change, both masks cover every row of df, so any row that is True in either m1 or m2 (including rows 0 and 1) is selected.
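To try the isin() approach end to end, here is a self-contained sketch. The six-row frame is a made-up sample shaped like the one in the question (not the asker's full data), and first_in_grp is just an illustrative name for the intermediate Series.

```python
import pandas as pd

# Hypothetical sample with the same columns as the question's dataframe.
df = pd.DataFrame({
    "Lvl": [37] * 6,
    "Distance": [63, 59, 54, 48, 27, 20],
    "iMap": ["A3", "A9", "A3", "A4", "A3", "A3"],
    "Grp": [1, 1, 2, 2, 1, 4],
})

# First row of each Grp among rows with Distance < 55.
# Its index is only a subset of df.index (labels 2, 3, 4, 5 here).
first_in_grp = df[df["Distance"] < 55].groupby("Grp").cumcount().eq(0)

# Expand to a full-length boolean mask: True where the label was flagged above.
m1 = df.index.isin(first_in_grp.index[first_in_grp])

# Second mask, already aligned with df.
m2 = df["Distance"].gt(55)

result = df[m1 | m2]
print(result)
```

On this sample the selected labels are 0 and 1 (from m2) plus 2, 4 and 5 (the first sub-55 row of groups 2, 1 and 4).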

You can reindex m1 to make it conform to df, like so (note that the keyword is fill_value, not fill_values):

m1 = df[(df["Distance"]<55)].groupby('Grp').cumcount().eq(0).reindex(df.index, fill_value=False)
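A short sketch of the reindex variant, again on a made-up six-row sample rather than the original data. It assumes that rows absent from the filtered frame should count as False, which is what fill_value=False expresses.

```python
import pandas as pd

# Hypothetical sample, reduced to the two columns the filters use.
df = pd.DataFrame({
    "Distance": [63, 59, 54, 48, 27, 20],
    "Grp": [1, 1, 2, 2, 1, 4],
})

m1 = (
    df[df["Distance"] < 55]
    .groupby("Grp")
    .cumcount()
    .eq(0)
    # Bring back the labels dropped by the filter and mark them False.
    .reindex(df.index, fill_value=False)
)
m2 = df["Distance"].gt(55)

# Both masks now share df's index, so the OR is evaluated row by row.
result = df[m1 | m2]
print(result)
```

Compared with the isin() version, this keeps m1 as a Series carrying df's index, which can be handy if the mask is inspected or reused later.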
