Return rows, as a DataFrame, that contain certain words in a particular column
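I want to get the rows of a DataFrame that match several search terms. For example:

path     description
c:/data  he is a good boy
d:/data  rabin is a good boy
e:/data  he is good
f:/data  he is a boy

I want all rows that contain ('good', 'boy'), returned as a DataFrame. The result should be:

c:/data  he is a good boy
d:/data  rabin is a good boy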
Let's go step by step.
Suppose you have the following DataFrame:
import pandas as pd
df = pd.DataFrame({
    'path': ['c:/data', 'd:/data', 'e:/data', 'f:/data'],
    'description': ['he is a good boy', 'rabin is a good boy', 'he is good', 'he is a boy']
})
df
           description     path
0     he is a good boy  c:/data
1  rabin is a good boy  d:/data
2           he is good  e:/data
3          he is a boy  f:/data
This is how you get what you want: build one str.contains mask per word and combine them with &:
df[df['description'].str.contains('good') & df['description'].str.contains('boy')]
           description     path
0     he is a good boy  c:/data
1  rabin is a good boy  d:/data
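If the search terms arrive as a list of arbitrary length, the same idea can be written as a loop that ANDs one mask per word. This is only a sketch: the variable names `words` and `mask` are made up for illustration, and passing `regex=False` and `na=False` is an assumption that the terms are plain substrings and that missing descriptions should not match.

```python
import pandas as pd

df = pd.DataFrame({
    'path': ['c:/data', 'd:/data', 'e:/data', 'f:/data'],
    'description': ['he is a good boy', 'rabin is a good boy', 'he is good', 'he is a boy']
})

words = ['good', 'boy']  # hypothetical list of search terms

# Start with an all-True mask, then AND in one substring test per word.
mask = pd.Series(True, index=df.index)
for word in words:
    # regex=False treats the word literally; na=False makes missing cells non-matching.
    mask &= df['description'].str.contains(word, regex=False, na=False)

result = df[mask]
print(result)
```

Note that this is still a substring match, so 'good' would also match a description containing 'goodness'.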
I wrote this for you…… it has not been tested.
def find_intrastr_column_df(df, column_name, list_of_str_to_find):
    """
    Loop over a pandas DataFrame, find all rows that contain every
    listed word in a particular column, and return a DataFrame
    with only the rows found.
    df: pandas DataFrame
    column_name: String. Column name.
    list_of_str_to_find: List of strings. The words to find.
    """
    list_rows_find = []
    for index, row in df.iterrows():
        cell_value = row[column_name]
        # Skip NaN values (NaN is the only value not equal to itself).
        # Use continue, not break, so the remaining rows are still checked.
        if cell_value != cell_value:
            continue
        # Keep the row only if every word occurs in the cell.
        found = True
        for word in list_of_str_to_find:
            if word not in cell_value:
                found = False
                break
        if found:
            list_rows_find.append(index)
    df_selected = df.loc[list_rows_find, :]
    return df_selected
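The row-by-row loop above can probably be condensed with `Series.apply` and `all()`. The following is a minimal sketch rather than a drop-in replacement: the helper name `rows_containing_all` and the extra `g:/data` row with a missing description are invented for illustration, and treating every non-string cell as a non-match is an assumption.

```python
import pandas as pd

def rows_containing_all(df, column_name, words):
    """Hypothetical helper: keep rows whose cell contains every word as a substring."""
    def has_all(cell):
        # Non-string cells (NaN, None, numbers) never match.
        return isinstance(cell, str) and all(word in cell for word in words)
    return df[df[column_name].apply(has_all)]

df = pd.DataFrame({
    'path': ['c:/data', 'd:/data', 'e:/data', 'f:/data', 'g:/data'],
    'description': ['he is a good boy', 'rabin is a good boy',
                    'he is good', 'he is a boy', None]
})

selected = rows_containing_all(df, 'description', ['good', 'boy'])
print(selected)
```

Checking `isinstance(cell, str)` first avoids a `TypeError` from the `in` test on missing values, which is the case the NaN check in the loop version guards against.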