
Pandas select rows where query is in column of tuples

I have a dataframe in which one column contains tuples:

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [(1, 2), (3, 4), (0, 4)]})

   a       b
0  1  (1, 2)
1  2  (3, 4)
2  3  (0, 4)

I would like to select the rows where an element I provide is in the tuple.

For example, to return the rows where 4 is in the tuple, the expected output would be:

   a       b
1  2  (3, 4)
2  3  (0, 4)

I have tried:

print(df[df['b'].isin([4])])

But this returns an empty dataframe:

Empty DataFrame
Columns: [a, b]
Index: []

You need apply with the in operator, so that the membership test runs inside each tuple:

print(df[df['b'].apply(lambda x: 4 in x)])
   a       b
1  2  (3, 4)
2  3  (0, 4)
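
The same membership test can be wrapped in a small helper. This is only a sketch: the name rows_containing is hypothetical, and it builds the boolean mask with a list comprehension instead of apply, which should select the same rows on this data.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [(1, 2), (3, 4), (0, 4)]})


def rows_containing(frame, column, value):
    """Return the rows of frame whose tuple in column contains value.

    Hypothetical helper for illustration; not part of pandas.
    """
    # One boolean per row: True when value is an element of that row's tuple.
    mask = pd.Series([value in cell for cell in frame[column]],
                     index=frame.index)
    return frame[mask]


result = rows_containing(df, 'b', 4)
print(result)
```

Keeping the mask aligned to the frame's index means the helper should also behave on dataframes whose index is not the default 0..n-1.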

You can first convert the tuples to sets and then intersect each set with the set of values you are looking for:

In [27]: df[df['b'].map(set) & {4}]
Out[27]:
   a       b
1  2  (3, 4)
2  3  (0, 4)

It also works for multiple values. For example, to find all rows where either 1 or 3 is in the tuple:

In [29]: df[df['b'].map(set) & {1, 3}]
Out[29]:
   a       b
0  1  (1, 2)
1  2  (3, 4)

Explanation:

In [30]: df['b'].map(set)
Out[30]:
0    {1, 2}
1    {3, 4}
2    {0, 4}
Name: b, dtype: object

In [31]: df['b'].map(set) & {1, 3}
Out[31]:
0     True
1     True
2    False
Name: b, dtype: bool
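
How & is evaluated between a Series of sets and a plain set may depend on the pandas version, so the output above might not reproduce everywhere. As a hedged alternative (not the answer's own method), the intersection test can be written out explicitly per row with set.isdisjoint. The variable name wanted is an assumption introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [(1, 2), (3, 4), (0, 4)]})

# Values to look for; a row matches if its tuple contains at least one of them.
wanted = {1, 3}

# isdisjoint is True when the two collections share no element,
# so negating it gives "has a non-empty intersection".
mask = df['b'].map(lambda cell: not wanted.isdisjoint(cell))

matches = df[mask]
print(matches)
```

Because isdisjoint accepts any iterable, the tuples do not need to be converted to sets first.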
