
Filter pandas.DataFrame with Python list in cell

I have a pandas.DataFrame like this:

import pandas as pd

data = {'name' : ['Alice', 'Bob', 'Eve'],
        'age' : ['20', '35', '40'],
        'stuff' : [['computer', 'phone', 'bike'], ['bike', 'skateboard', 'phone'], 
                   ['computer', 'phone', 'skateboard']]}

frame = pd.DataFrame(data)

How can I select rows where age > 30 and stuff contains 'computer'?

I've tried to solve this with DataFrame.loc:

filteredFrame = frame.loc[(frame.age > 30)&('computer' in frame.stuff)]

But it doesn't work.

First convert the age column to numbers, if necessary:

frame.age = frame.age.astype(int)
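
astype(int) raises an error if any entry cannot be parsed as an integer. As a hedged alternative, the sketch below uses pd.to_numeric with errors='coerce', which turns unparseable entries into NaN. The ages Series and its 'unknown' entry are hypothetical and not part of the question's data.

```python
import pandas as pd

# Hypothetical column with one unparseable entry, for illustration only
ages = pd.Series(['20', '35', 'unknown'])

# errors='coerce' converts values that cannot be parsed into NaN instead of raising
numeric_ages = pd.to_numeric(ages, errors='coerce')

# Comparing NaN with a number gives False, so the bad row drops out of the mask
mask = numeric_ages > 30
print(mask.tolist())
```

With the question's data every age parses cleanly, so astype(int) is enough there.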

Then test each list row by row with Series.map or Series.apply (either line works):

filteredFrame = frame.loc[(frame.age > 30)&(frame.stuff.map(lambda x: 'computer' in x))]
filteredFrame = frame.loc[(frame.age > 30)&(frame.stuff.apply(lambda x: 'computer' in x))]

Or use a list comprehension:

filteredFrame = frame.loc[(frame.age > 30)&(['computer' in x for x in frame.stuff])]

print (filteredFrame)
  name  age                          stuff
2  Eve   40  [computer, phone, skateboard]
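
To see why the original attempt fails and to check the fix end to end, here is a minimal self-contained sketch using the data from the question. The names label_check, has_computer, mask, and result are illustrative only.

```python
import pandas as pd

data = {'name': ['Alice', 'Bob', 'Eve'],
        'age': ['20', '35', '40'],
        'stuff': [['computer', 'phone', 'bike'],
                  ['bike', 'skateboard', 'phone'],
                  ['computer', 'phone', 'skateboard']]}
frame = pd.DataFrame(data)

# `in` on a Series tests the index labels (0, 1, 2 here), not the cell values,
# so this is a single bool rather than a row-by-row mask
label_check = 'computer' in frame.stuff
print(label_check)  # False

# Ages are stored as strings, so convert before comparing with a number
frame['age'] = frame['age'].astype(int)

# Build one boolean per row, then combine the two conditions with &
has_computer = frame['stuff'].map(lambda items: 'computer' in items)
mask = (frame['age'] > 30) & has_computer
result = frame.loc[mask]
print(result['name'].tolist())  # ['Eve']
```

Because 'computer' in frame.stuff collapses to a single False, the original expression could never select a row, even after the ages are converted.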

