Hello, I need to focus on specific groups within a table.
Here is an example:
groups col1
A 3
A 4
A 2
A 1
B 3
B 3
B 4
C 2
D 4
D 3
I would like to show only the groups that contain 3 and 4 but no other number. Here I should get:
groups col1
B 3
B 3
B 4
D 4
D 3
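For reference, the sample table can be rebuilt as a DataFrame roughly like this. It is only a sketch: the variable name df is an assumption, chosen because the answers below refer to the data by that name.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "groups": ["A", "A", "A", "A", "B", "B", "B", "C", "D", "D"],
    "col1": [3, 4, 2, 1, 3, 3, 4, 2, 4, 3],
})
print(df)
```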
Here are 2 possible approaches. The first tests the values for membership with Series.isin, then marks the groups in which every value is True with GroupBy.transform and GroupBy.all, and finally filters by boolean indexing:
df1 = df[df['col1'].isin([3,4]).groupby(df['groups']).transform('all')]
print (df1)
groups col1
4 B 3
5 B 3
6 B 4
8 D 4
9 D 3
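The one-liner above can be split into its three steps to make the intermediate masks visible. This is a minimal sketch on the question's sample data; the names is_allowed, group_ok and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "groups": ["A", "A", "A", "A", "B", "B", "B", "C", "D", "D"],
    "col1": [3, 4, 2, 1, 3, 3, 4, 2, 4, 3],
})

# Step 1: row-level test, True where col1 is 3 or 4.
is_allowed = df["col1"].isin([3, 4])

# Step 2: per group, check that every row passed, and broadcast
# that single answer back onto each row of the group.
group_ok = is_allowed.groupby(df["groups"]).transform("all")

# Step 3: boolean indexing keeps only rows of fully passing groups.
result = df[group_ok]
print(result)
```

Group A is dropped even though it has a 3 and a 4, because its rows with 2 and 1 make the group-level check False.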
Another approach is to first get the groups that have at least one value other than 3 or 4, then pass them to a second isin call and invert the mask:
df1 = df[~df['groups'].isin(df.loc[~df['col1'].isin([3,4]), 'groups'])]
print (df1)
groups col1
4 B 3
5 B 3
6 B 4
8 D 4
9 D 3
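Unrolled, this second approach might look like the sketch below. The intermediate names bad_rows and bad_groups are assumptions added for readability; the logic is the same as in the one-liner.

```python
import pandas as pd

df = pd.DataFrame({
    "groups": ["A", "A", "A", "A", "B", "B", "B", "C", "D", "D"],
    "col1": [3, 4, 2, 1, 3, 3, 4, 2, 4, 3],
})

# Rows whose value is something other than 3 or 4.
bad_rows = ~df["col1"].isin([3, 4])

# Groups that own at least one such row.
bad_groups = df.loc[bad_rows, "groups"].unique()

# Keep every row whose group is not among the bad ones.
result = df[~df["groups"].isin(bad_groups)]
print(result)
```

This version avoids groupby entirely, which some readers may find easier to follow.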
We can also use GroupBy.filter:
new_df = df.groupby('groups').filter(lambda x: x.col1.isin([3, 4]).all())
print(new_df)
groups col1
4 B 3
5 B 3
6 B 4
8 D 4
9 D 3
An alternative that removes Series.isin from the lambda function:
df['aux']=df['col1'].isin([3,4])
df.groupby('groups').filter(lambda x: x.aux.all()).drop('aux',axis=1)
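One caveat: the checks above only test that a group has no value other than 3 or 4, so a group made up only of 3s would also be kept. If the requirement is read strictly (both 3 and 4 must appear, and nothing else), a set comparison inside GroupBy.filter is one possible sketch. The data here is hypothetical, with an extra group E added to show the difference.

```python
import pandas as pd

# Hypothetical data: group E contains only 3s, group A has a stray 2.
df = pd.DataFrame({
    "groups": ["A", "A", "B", "B", "E", "E"],
    "col1": [3, 2, 3, 4, 3, 3],
})

required = {3, 4}

# Strict reading: the set of values in the group is exactly {3, 4}.
strict = df.groupby("groups").filter(
    lambda g: set(g["col1"].tolist()) == required
)

# Loose reading, as in the answers above: no value outside {3, 4}.
loose = df.groupby("groups").filter(
    lambda g: g["col1"].isin([3, 4]).all()
)

print(strict)
print(loose)
```

On the question's own sample data both readings give the same result, since groups B and D each contain a 3 and a 4.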
Using df.loc[] and then filtering with an ordinary boolean condition should work.
import pandas as pd
data = [['A', 3],
['A', 4],
['A', 2],
['A', 1],
['B', 3],
['B', 3],
['B', 4],
['C', 2],
['D', 4],
['D', 3]]
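df = pd.DataFrame(data, columns=["groups", "col1"])
# A plain row test (df["col1"] >= 3) would also keep the 3 and 4 rows of group A,
# so the condition is evaluated per group before it is passed to df.loc[].
mask = df["col1"].isin([3, 4]).groupby(df["groups"]).transform("all")
df = df.loc[mask]
print(df)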