
Filter a dataframe column for a keyword, return a separate column value (name) from each row where the keyword is found

I have a data frame and I want to return the values in one column whenever I find a keyword in another column. In the example below, if I search for apple I want the output to be [a,b].

like this:

names words
a     apple
b     apple
c     pear

I would want a list that is: [a,b]

I have found ways to return the boolean values using str.contains, but I am not sure how to take the value from another column in the same row, which would give me the name. There must be an existing post that I can't find; if anyone can direct me there, that would be appreciated.

You could do

list(df[df['words'].str.contains('apple')]['names'])

resulting in

['a', 'b']
  1. df['words'].str.contains('apple') builds a boolean pandas Series for the condition
  2. the Series from the previous step is used to filter the original dataframe df
  3. in the dataframe resulting from the previous step, the 'names' column is selected
  4. the Series resulting from the previous step is cast to a list
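
To see what each of the four steps produces, the one-liner can be split into intermediate variables. This is only an illustrative sketch: the names mask, filtered and names are made up here, and the dataframe is built directly from a dict rather than parsed from text.

```python
import pandas as pd

# Same sample data as in the question, built from a dict for brevity
df = pd.DataFrame({
    "names": ["a", "b", "c"],
    "words": ["apple", "apple", "pear"],
})

# Step 1: boolean Series, True where 'words' contains the keyword
mask = df["words"].str.contains("apple")

# Step 2: keep only the rows where the mask is True
filtered = df[mask]

# Step 3: select the 'names' column of the filtered dataframe (a Series)
names = filtered["names"]

# Step 4: convert the Series to a plain Python list
lst = list(names)

print(mask.tolist())  # [True, True, False]
print(lst)            # ['a', 'b']
```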

Full code:

import io
import pandas as pd
data = """
names words
a     apple
b     apple
c     pear
"""
# Raw string so that \s is passed to the regex engine rather than treated as a Python escape
df = pd.read_csv(io.StringIO(data), sep=r'\s+')

lst = list(df[df['words'].str.contains('apple')]['names'])


>>> print(lst)
['a', 'b']
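
A possible variant, in case the real data is messier than the example: .loc selects the rows and the column in one step, regex=False treats the keyword as a literal string, and na=False counts missing values as non-matches. The extra rows ('apple pie' and a missing value) are assumptions added here for illustration and are not part of the original question.

```python
import pandas as pd

# Hypothetical data: one partial match and one missing value added for illustration
df = pd.DataFrame({
    "names": ["a", "b", "c", "d"],
    "words": ["apple", "apple pie", "pear", None],
})

# Literal substring match; rows with a missing value count as "no match"
mask = df["words"].str.contains("apple", regex=False, na=False)

# Select matching rows and the 'names' column in a single .loc call
result = df.loc[mask, "names"].tolist()

print(result)  # ['a', 'b']
```

Note that str.contains matches substrings, so 'apple pie' is included; for exact matches only, comparing with df["words"] == "apple" would be the simpler choice.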
