I have a DataFrame like this:
import pandas as pd
data = {'Index Title': ["Company1", "Company1", "Company2", "Company3"],
        'BusinessType': ['Type 1', 'Type 2', 'Type 1', 'Type 2'],
        'ID1': ['123', '456', '789', '012']}
df = pd.DataFrame(data)
df.index = df["Index Title"]
del df["Index Title"]
print(df)
where Index Title is a company name. Company1 has two types, Type 1 and Type 2.
I would like to drop the rows for companies that have only one type, either Type 1 or Type 2.
So in this case Company2 and Company3 should be dropped.
What is the best way to do that?
For problems like this, the usual approach is groupby- and transform-based filtering, which is quite fast:
df[df.groupby(level=0)['BusinessType'].transform('nunique') > 1]
BusinessType ID1
Index Title
Company1 Type 1 123
Company1 Type 2 456
The first step is to determine the groups/rows which are associated with more than one type:
df.groupby(level=0)['BusinessType'].transform('nunique')
Index Title
Company1 2
Company1 2
Company2 1
Company3 1
Name: BusinessType, dtype: int64
From here, we keep only the rows for companies whose number of unique types is greater than 1, which removes all companies associated with exactly one type.
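Putting the two steps together, here is a minimal, self-contained sketch using the question's sample data (the variable names `counts` and `result` are just illustrative):

```python
import pandas as pd

# Sample data from the question
data = {'Index Title': ['Company1', 'Company1', 'Company2', 'Company3'],
        'BusinessType': ['Type 1', 'Type 2', 'Type 1', 'Type 2'],
        'ID1': ['123', '456', '789', '012']}
df = pd.DataFrame(data).set_index('Index Title')

# Count distinct types per company, broadcast back to every row of the group
counts = df.groupby(level=0)['BusinessType'].transform('nunique')

# Boolean mask: keep rows belonging to companies with more than one type
result = df[counts > 1]
print(result)  # only the two Company1 rows remain
```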
This is another way:
- group by Index Title
- keep a group only if it contains at least one Type 1 and at least one Type 2
df = (
    df.groupby('Index Title')
      .filter(lambda x: (x['BusinessType'] == 'Type 1').any() &
                        (x['BusinessType'] == 'Type 2').any())
      .reset_index()
)
Update: if you are looking for companies with two or more types, regardless of which types they are:
df = (
    df.groupby('Index Title')
      .filter(lambda x: x['BusinessType'].nunique() > 1)
      .reset_index()
)
In that case @cs95's answer is the cleaner one, and the one you should use.
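If you do need the stricter "must contain both Type 1 and Type 2" condition but want to stay with fast, vectorized transforms instead of a Python-level `filter` lambda, one possible sketch (the mask names `has_t1`/`has_t2` are illustrative) is:

```python
import pandas as pd

# Sample data from the question
data = {'Index Title': ['Company1', 'Company1', 'Company2', 'Company3'],
        'BusinessType': ['Type 1', 'Type 2', 'Type 1', 'Type 2'],
        'ID1': ['123', '456', '789', '012']}
df = pd.DataFrame(data).set_index('Index Title')

# Broadcast "this company has Type 1" / "has Type 2" back to every row
has_t1 = df['BusinessType'].eq('Type 1').groupby(df.index).transform('any')
has_t2 = df['BusinessType'].eq('Type 2').groupby(df.index).transform('any')

# Keep rows only for companies that have both types
result = df[has_t1 & has_t2]
print(result)  # only the two Company1 rows survive
```

For this sample data the result matches the `nunique` approach; the two differ only when a company could have two or more types without having exactly Type 1 and Type 2.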