
Return the rows, as a dataframe, that contain certain words in a particular column

I want to get the rows of a dataframe that match several search terms. For example:

path     description
c:/data  he is a good boy
d:/data  rabin is a good boy
e:/data  he is good
f:/data  he is a boy

I want to get, as a dataframe, all rows whose description contains both 'good' and 'boy'. The result should be:

path     description
c:/data  he is a good boy
d:/data  rabin is a good boy

Let's go step by step.

Suppose you have a dataframe like this:

import pandas as pd
df = pd.DataFrame({
    'path': ['c:/data', 'd:/data', 'e:/data', 'f:/data'],
    'description': ['he is a good boy', 'rabin is a good boy', 'he is good', 'he is a boy']
})
df
    description         path
0   he is a good boy    c:/data
1   rabin is a good boy d:/data
2   he is good          e:/data
3   he is a boy         f:/data

This is how you get what you want:

df[df['description'].str.contains('good') & df['description'].str.contains('boy')]
    description         path
0   he is a good boy    c:/data
1   rabin is a good boy d:/data
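
The two-condition filter above can be extended to any number of search words. The following is a minimal sketch, not part of the original answer: it builds one `str.contains` mask per word and combines them with a logical AND. The variable names `words`, `masks`, and `all_present` are illustrative assumptions. Passing `regex=False` treats each word as a literal substring, and `na=False` makes missing descriptions count as non-matches.

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame({
    'path': ['c:/data', 'd:/data', 'e:/data', 'f:/data'],
    'description': ['he is a good boy', 'rabin is a good boy', 'he is good', 'he is a boy'],
})

# Search terms (illustrative); the list must not be empty,
# because reduce() has nothing to combine otherwise.
words = ['good', 'boy']

# One boolean mask per word. regex=False means a literal substring test,
# na=False means a missing description counts as "no match".
masks = [df['description'].str.contains(w, regex=False, na=False) for w in words]

# AND all masks together: a row is kept only if every word is present.
all_present = reduce(operator.and_, masks)

result = df[all_present]
print(result)
```

Note that this is still substring matching, so 'boy' would also match a description containing 'boyhood'.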

I wrote this for you... it has not been tested.

def find_intrastr_column_df(df, column_name, list_of_str_to_find):
    """
    Loop over a pandas DataFrame and find all rows whose value in
    a particular column contains every one of the listed words.
    Return a DataFrame with only the rows found.

    df: pandas DataFrame
    column_name: String. Name of the column to search.
    list_of_str_to_find: List of strings. The words to find.
    """

    list_rows_find = []

    for index, row in df.iterrows():
        cell_value = row[column_name]
        # skip NaN values (NaN is the only value not equal to itself);
        # use continue, not break, so the remaining rows are still checked
        if cell_value != cell_value:
            continue

        # keep the row only if every word occurs in the cell
        if all(word in cell_value for word in list_of_str_to_find):
            list_rows_find.append(index)

    df_selected = df.loc[list_rows_find, :]
    return df_selected
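
The loop above tests substrings, so 'boy' would also match 'boyhood'. If whole-word matching is what is wanted, the same idea can be sketched without `iterrows`. This is a sketch only, under assumptions: the helper name `rows_with_all_words` is hypothetical, and the fourth and fifth sample rows are invented here to show the substring and missing-value cases. Words are split on whitespace only, so punctuation attached to a word (for example 'boy.') would prevent a match.

```python
import pandas as pd


def rows_with_all_words(df, column_name, words):
    """Hypothetical helper: keep rows whose cell contains every word as a whole token."""
    wanted = set(words)

    def has_all(cell):
        # Non-string cells (NaN, None, numbers) never match.
        if not isinstance(cell, str):
            return False
        # Split on whitespace and check that every wanted word is a token.
        return wanted <= set(cell.split())

    mask = df[column_name].apply(has_all)
    return df[mask]


# Sample data; the last two rows are illustrative additions.
df = pd.DataFrame({
    'path': ['c:/data', 'd:/data', 'e:/data', 'f:/data', 'g:/data'],
    'description': ['he is a good boy', 'rabin is a good boy', 'he is good',
                    'goodness, a boyhood tale', None],
})

selected = rows_with_all_words(df, 'description', ['good', 'boy'])
print(selected)
```

Here the row containing 'goodness' and 'boyhood' is excluded because neither token equals a search word exactly, and the row with a missing description is skipped instead of stopping the search.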

