How to check whether a value in one column maps to more than one value in another column

Question

I have the following dataframe:

import pandas as pd

df = pd.DataFrame()
df['id'] = [1, 1, 2, 2]
df['col1'] = [10, 10, 20, 20]
df['col2'] = [100, 200, 50, 50]
df['col3'] = [1, 2, 3, 4]

The goal

From this dataframe, I want to return the rows where, within a particular id, a value in col1 is paired with more than one value in col2. In this case, id 1 has 10 in col1 and 100 in col2 in the first row. Since id 1 also has 10 in col1 in the second row, col2 should again be 100 there, but it is 200, so these rows are inconsistent. For id 2 the values are consistent. The check should work both ways: within an id, the values of col1 and col2 should simply be consistent with each other. col3 contains other values that are not relevant to the matching but should be included in the result.

Desired output

The part of the dataframe where the column values do not match:

df = pd.DataFrame()
df['id'] = [1, 1]
df['col1'] = [10, 10]
df['col2'] = [100, 200]
df['col3'] = [1, 2]

Answer 1

Group by id and col1, count the unique col2 values in each group, and keep the rows whose group does not have exactly one unique value:

df = df[(df.groupby(['id', 'col1'])['col2'].transform(lambda x: x.nunique()!=1))]
print(df)

   id  col1  col2  col3
0   1    10   100     1
1   1    10   200     2
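
The question asks for the check to work both ways, while the snippet above only tests col1 against col2. A minimal sketch of a two-directional version follows, assuming "consistent" means that within an id each col1 value pairs with exactly one col2 value and each col2 value pairs with exactly one col1 value. The variable names (col2_per_col1, col1_per_col2, inconsistent) are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "col1": [10, 10, 20, 20],
    "col2": [100, 200, 50, 50],
    "col3": [1, 2, 3, 4],
})

# Direction 1: distinct col2 values seen for each (id, col1) pair,
# broadcast back to every row of the group.
col2_per_col1 = df.groupby(["id", "col1"])["col2"].transform("nunique")

# Direction 2: distinct col1 values seen for each (id, col2) pair.
col1_per_col2 = df.groupby(["id", "col2"])["col1"].transform("nunique")

# A row is inconsistent if either direction has more than one partner value.
inconsistent = df[(col2_per_col1 > 1) | (col1_per_col2 > 1)]
print(inconsistent)
```

With the sample data this selects the two rows of id 1 and keeps col3 alongside them. Note that nunique ignores NaN by default, so missing values in col1 or col2 may need separate handling.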

1 answer

solution1
1 ACCEPTED 2020-04-08 12:25:41