Here is the input dataframe:
df_data = pd.DataFrame({'A':[2,1,3], 'content': ['the dog is sleeping', 'my name is Dude', 'i am who i am']})
and a list of words:
words_list= ['dog', 'Dude','sleeping', 'i']
Now, I know how to create a new column indicating whether a row contains any of the words I want, something like this:
df_data['new'] = df_data.apply(lambda row: True if any([item in row['content'] for item in words_list]) else False, axis = 1)
The point is that I also want a count of the words. For example, in row number 2 and row number 3 I have 2 words from my list, so I want a new column with the value 2, etc.
Thank you!
Try pandas.Series.str.findall to extract the matches, then take the length of each list of matches.
import pandas as pd
import re
df_data = pd.DataFrame({'A':[2,1,3], 'content': ['the dog is sleeping', 'my name is Dude', 'i am who i am']})
words_list= ['dog', 'Dude','sleeping', 'i']
# Escape each word and match it only as a whole word (\b = word boundary)
search_ = re.compile(r"\b(?:%s)\b" % "|".join(map(re.escape, words_list)))
df_data['matches'] = df_data.content.str.findall(search_)
df_data['count'] = df_data['matches'].apply(len)
   A              content          matches  count
0  2  the dog is sleeping  [dog, sleeping]      2
1  1      my name is Dude           [Dude]      1
2  3        i am who i am           [i, i]      2
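If only the count is needed and not the list of matches, a minimal sketch of an alternative is to use pandas.Series.str.count with the same kind of whole-word pattern. The variable name `pattern` is introduced here for illustration only, and the words are escaped on the assumption that they should be matched literally.

```python
import re
import pandas as pd

df_data = pd.DataFrame({
    'A': [2, 1, 3],
    'content': ['the dog is sleeping', 'my name is Dude', 'i am who i am'],
})
words_list = ['dog', 'Dude', 'sleeping', 'i']

# Escape each word so regex metacharacters are treated literally,
# then wrap the alternation in word boundaries (pattern is an illustrative name).
pattern = r"\b(?:%s)\b" % "|".join(re.escape(w) for w in words_list)

# str.count returns the number of non-overlapping regex matches in each row.
df_data['count'] = df_data['content'].str.count(pattern)
print(df_data)
```

This counts every occurrence, so the repeated 'i' in the last row contributes twice.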
First, I think you need to modify your initial function as it may provide an incorrect output.
For example:
words_list= ['do']
df_data['new'] = df_data.apply(lambda row: True if any([item in row['content'] for item in words_list]) else False, axis = 1)
Results in
   A              content    new
0  2  the dog is sleeping   True
1  1      my name is Dude  False
2  3        i am who i am  False
However, there is no word 'do' in the first row; the check succeeds only because 'do' is a substring of 'dog'. It can be fixed by splitting the row content into a list of words:
row['content'].split()
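To illustrate the difference, here is a small sketch comparing the substring test with the word-level test on the first row. The variable `content` is just a stand-in for row['content'].

```python
content = 'the dog is sleeping'

# Substring test: 'do' is found inside 'dog'.
substring_hit = 'do' in content

# Word-level test: 'do' is not one of the whitespace-separated words.
word_hit = 'do' in content.split()

print(substring_hit, word_hit)
```

Note that split() separates on whitespace only, so a word followed by punctuation (for example 'dog,') would not match 'dog' with this approach.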
The count can then be obtained simply by applying sum to the list of booleans:
df_data['new'] = df_data.apply(lambda row: sum([item in row['content'].split() for item in words_list]), axis = 1)
Output:
   A              content  new
0  2  the dog is sleeping    2
1  1      my name is Dude    1
2  3        i am who i am    1
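The last row gets 1 rather than 2 because each word in the list is tested for membership once, so the repeated 'i' is counted a single time. If every occurrence should be counted, one possible sketch uses collections.Counter on the split words. The helper name `count_occurrences` and the column name 'occurrences' are hypothetical, chosen only for this example.

```python
from collections import Counter

import pandas as pd

df_data = pd.DataFrame({
    'A': [2, 1, 3],
    'content': ['the dog is sleeping', 'my name is Dude', 'i am who i am'],
})
words_list = ['dog', 'Dude', 'sleeping', 'i']


def count_occurrences(text, words):
    """Count every occurrence of the listed words among whitespace-separated tokens."""
    token_counts = Counter(text.split())
    # Counter returns 0 for words that do not appear in the text.
    return sum(token_counts[w] for w in words)


# 'occurrences' is an illustrative column name.
df_data['occurrences'] = df_data['content'].apply(
    lambda text: count_occurrences(text, words_list)
)
print(df_data)
```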