Only show specific groups in a pandas DataFrame
Hello, I need to focus on specific groups within a table.
Here is an example:
groups col1
A 3
A 4
A 2
A 1
B 3
B 3
B 4
C 2
D 4
D 3
I would like to show only the groups that contain 3 and 4 but no other number. Here I should get:
groups col1
B 3
B 3
B 4
D 4
D 3
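The answers below use a DataFrame named df without defining it. A minimal sketch for rebuilding the sample table, assuming the column names groups and col1 shown in the question and a default integer index:

```python
import pandas as pd

# Sample data from the question: one group label and one value per row.
df = pd.DataFrame({
    "groups": ["A", "A", "A", "A", "B", "B", "B", "C", "D", "D"],
    "col1": [3, 4, 2, 1, 3, 3, 4, 2, 4, 3],
})
print(df)
```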
Here are 2 possible approaches. The first tests values for membership with Series.isin, then uses GroupBy.transform with GroupBy.all to mark the groups in which every value is True, and finally filters by boolean indexing:
df1 = df[df['col1'].isin([3,4]).groupby(df['groups']).transform('all')]
print (df1)
groups col1
4 B 3
5 B 3
6 B 4
8 D 4
9 D 3
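The one-liner above can be easier to follow when split into its three steps. This is only a sketch: the intermediate names is_allowed and group_ok are made up for illustration, and the sample data is rebuilt so the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    "groups": ["A", "A", "A", "A", "B", "B", "B", "C", "D", "D"],
    "col1": [3, 4, 2, 1, 3, 3, 4, 2, 4, 3],
})

# Step 1: row-wise membership test, True where col1 is 3 or 4.
is_allowed = df["col1"].isin([3, 4])

# Step 2: for each row, True only if every row of its group passed step 1.
group_ok = is_allowed.groupby(df["groups"]).transform("all")

# Step 3: boolean indexing keeps the rows of fully matching groups.
df1 = df[group_ok]
print(df1)
```

Because transform returns a result aligned with the original index, group_ok can be used directly as a row mask.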
Another approach is to first collect the labels of all groups that contain any value other than 3 and 4, then pass them to a second isin call with an inverted mask:
df1 = df[~df['groups'].isin(df.loc[~df['col1'].isin([3,4]), 'groups'])]
print (df1)
groups col1
4 B 3
5 B 3
6 B 4
8 D 4
9 D 3
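The same logic, unpacked into steps. The names bad_rows and bad_groups are hypothetical, chosen only for this sketch, and the sample data is rebuilt so the snippet is self-contained.

```python
import pandas as pd

df = pd.DataFrame({
    "groups": ["A", "A", "A", "A", "B", "B", "B", "C", "D", "D"],
    "col1": [3, 4, 2, 1, 3, 3, 4, 2, 4, 3],
})

# Step 1: rows whose value is NOT 3 or 4.
bad_rows = ~df["col1"].isin([3, 4])

# Step 2: labels of the groups that own at least one such row.
bad_groups = df.loc[bad_rows, "groups"].unique()

# Step 3: keep every row whose group is not among the bad ones.
df1 = df[~df["groups"].isin(bad_groups)]
print(df1)
```

This variant avoids groupby altogether and relies on two isin calls only.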
We can also use GroupBy.filter:
new_df = df.groupby('groups').filter(lambda x: x['col1'].isin([3, 4]).all())
print(new_df)
groups col1
4 B 3
5 B 3
6 B 4
8 D 4
9 D 3
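The checks shown so far only require that every value in a group is 3 or 4, so a group made up solely of 3s would also be kept. If the requirement is read strictly (the group must contain both 3 and 4 and nothing else), comparing the set of values is one possible sketch. The extra group E below is not in the question; it is added here only to show the difference.

```python
import pandas as pd

df = pd.DataFrame({
    # Group "E" is a hypothetical addition: it contains only 3s.
    "groups": ["A", "A", "A", "A", "B", "B", "B", "C", "D", "D", "E", "E"],
    "col1": [3, 4, 2, 1, 3, 3, 4, 2, 4, 3, 3, 3],
})

required = {3, 4}

# Loose check: every value is 3 or 4 (group E passes).
loose = df.groupby("groups").filter(lambda g: g["col1"].isin([3, 4]).all())

# Strict check: the distinct values are exactly {3, 4} (group E fails).
strict = df.groupby("groups").filter(lambda g: set(g["col1"]) == required)

print(loose["groups"].unique())
print(strict["groups"].unique())
```

On the original sample data both checks give the same result, since no group consists of only 3s or only 4s.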
An alternative that moves Series.isin out of the lambda function:
df['aux'] = df['col1'].isin([3, 4])
df.groupby('groups').filter(lambda x: x['aux'].all()).drop('aux', axis=1)
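Note that the snippet above leaves the helper column aux on df itself, because drop is applied only to the filtered result. A possible variant, sketched here with DataFrame.assign, builds the helper column on a temporary copy so the original frame stays untouched:

```python
import pandas as pd

df = pd.DataFrame({
    "groups": ["A", "A", "A", "A", "B", "B", "B", "C", "D", "D"],
    "col1": [3, 4, 2, 1, 3, 3, 4, 2, 4, 3],
})

result = (
    df.assign(aux=df["col1"].isin([3, 4]))  # helper column on a copy
      .groupby("groups")
      .filter(lambda g: g["aux"].all())     # keep groups where all rows match
      .drop(columns="aux")                  # remove the helper again
)
print(result)
```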
Using df.loc[] with an ordinary comparison should work. Note that this selects individual rows with a value of 3 or more rather than whole groups, so the rows A 3 and A 4 are kept as well.
import pandas as pd
data = [['A', 3],
['A', 4],
['A', 2],
['A', 1],
['B', 3],
['B', 3],
['B', 4],
['C', 2],
['D', 4],
['D', 3]]
df = pd.DataFrame(data, columns=["groups", "col1"])
df = df.loc[df["col1"] >= 3]
print(df.head())