
Find a string from a column in a pandas DataFrame which matches any item from another list of strings

I have a pandas DataFrame df:

                                A
0                      I need PEN
1  something went wrong in LAPTOP
2                     I eat MANGO
3            I dont know anything

and a Python list matchers = ["BAT", "PEN", "LAPTOP", "I", "SCHOOL", ...]

I need to add a new column B that holds the strings from the list that match each row.


df['B']=df['A'].str.extract("(" + "|".join(matchers) + ")",expand=True)      
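A note on why this attempt gives only one value per row: as far as I know, str.extract keeps just the first match, while str.findall collects all of them. A minimal sketch, assuming the sample rows from the question and only the five list entries that are shown:

```python
import pandas as pd

df = pd.DataFrame({"A": ["I need PEN",
                         "something went wrong in LAPTOP",
                         "I eat MANGO",
                         "I dont know anything "]})
matchers = ["BAT", "PEN", "LAPTOP", "I", "SCHOOL"]  # only the entries shown in the question

pattern = "(" + "|".join(matchers) + ")"

# str.extract keeps only the first match found in each row
first_only = df["A"].str.extract(pattern, expand=False)

# str.findall returns a list with every match found in each row
all_matches = df["A"].str.findall(pattern)

print(first_only.tolist())
print(all_matches.tolist())
```

For the first row, first_only holds "I" alone, while all_matches holds both "I" and "PEN".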

Use str.findall and then join:

import pandas as pd
import re

df = pd.DataFrame({"A":["I need PEN",
                        "something went wrong in LAPTOP",
                        "I eat MANGO",
                        "I dont know anything about school"]})

matches = ["BAT","PEN","LAPTOP","I","SCHOOL"]
pattern = "|".join(f"\\b{i}\\b" for i in matches)

df["B"] = df['A'].str.findall(pattern,flags=re.IGNORECASE).str.join(",")

print(df)

# Output:
                                   A         B
0                         I need PEN     I,PEN
1     something went wrong in LAPTOP    LAPTOP
2                        I eat MANGO         I
3  I dont know anything about school  I,school
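If the list entries can contain regex metacharacters, it may be safer to escape them before joining. A hedged sketch using re.escape; the entry "V1.0" and both sample rows are invented for illustration and are not part of the question:

```python
import re
import pandas as pd

# Made-up rows: the second one should NOT match "V1.0"
df = pd.DataFrame({"A": ["I need PEN V1.0", "this is V1x0"]})
matches = ["PEN", "V1.0"]  # "V1.0" is a hypothetical entry containing "."

# re.escape turns "." into a literal dot, so "V1.0" no longer matches "V1x0"
pattern = "|".join(rf"\b{re.escape(m)}\b" for m in matches)

df["B"] = df["A"].str.findall(pattern, flags=re.IGNORECASE).str.join(",")
print(df)
```

Note that \b only behaves as expected when an entry starts and ends with a word character; entries ending in punctuation would need a different boundary check.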

Just use the df.apply function:

def fn_apply(x):
    default_list = ["BAT","PEN","LAPTOP","I","SCHOOL"]
    b_list = []
    for item in default_list:
        if item.upper() in x.A.upper().split():
            b_list.append(item)
    return ",".join(b_list)

df['B'] = df.apply(fn_apply, axis=1)
df

    A                                   B
0   I need PEN                          PEN,I
1   something went wrong in LAPTOP      LAPTOP
2   eat MANGO   
3   dont know anythingabout school      SCHOOL

Let me know if this works for you.
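The same word-by-word check can also be applied to column A directly, which avoids axis=1. This is only a sketch: find_matches is a hypothetical helper name, and the rows are the sample data from the str.findall answer.

```python
import pandas as pd

df = pd.DataFrame({"A": ["I need PEN",
                         "something went wrong in LAPTOP",
                         "I eat MANGO",
                         "I dont know anything about school"]})
default_list = ["BAT", "PEN", "LAPTOP", "I", "SCHOOL"]

def find_matches(text):
    # Upper-case the words once and keep them in a set for fast lookups
    words = set(text.upper().split())
    # Results follow the order of default_list, not the order in the text
    return ",".join(item for item in default_list if item.upper() in words)

df["B"] = df["A"].apply(find_matches)
print(df)
```

Because the result follows the list order, the first row yields "PEN,I" rather than "I,PEN".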

With a simpler pattern:
import re
df['B'] = df['A'].str.findall('(' + '|'.join(matches) + ')', flags=re.IGNORECASE).str.join(',')
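One thing to watch with this simpler pattern: without \b word boundaries, a short entry such as "I" also matches inside other words once case is ignored. A quick sketch on one sample sentence, using plain re.findall rather than pandas to keep it small:

```python
import re

matches = ["BAT", "PEN", "LAPTOP", "I", "SCHOOL"]
text = "something went wrong in LAPTOP"

# No word boundaries: "I" also matches the "i" inside "something" and "in"
loose = re.findall("(" + "|".join(matches) + ")", text, flags=re.IGNORECASE)

# Word boundaries: only whole words are matched
strict = re.findall("|".join(rf"\b{m}\b" for m in matches), text, flags=re.IGNORECASE)

print(loose)
print(strict)
```

Here loose picks up two stray "i" matches before "LAPTOP", while strict returns "LAPTOP" only.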

