
python dataframe matching with list items

I have a dataframe df with a column A, as shown below:

["User mapping missing constantly for random users on product PLA-ZA. Hi , Generally, these results look encouraging. Here's what I see at the moment:- On Sept 12, when you ran this script, exactly the same users were shown as not "in sync" as we see in the most recent output.",

"History ------------ *** Audit: MLACAMBR 01/10/2018 18:40:05 GMT ",

 "1. find the process ids by doing ps -efl | grep BEMS74397" , 

"kill each of these processes, as follows (for example, if the process ID is 555555):", 

"Troubleshoots from the KMCas well as from the sensor where the connections occurred"]

and a Python list of strings to match against it:

["PLA-ZA","BEMS","MLACAMBR","KMC","OWL",,,,]

The dataframe needs a new column B containing the strings from the list that match each row.

(Image: the new dataframe with the matched keywords added as a new column)

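import re

# matches: the keyword list from the question; 'text': the column to search (A in the question)
pattern = "|".join(rf"\b{re.escape(i)}\b" for i in matches)
df["B"] = df["text"].str.findall(pattern, flags=re.IGNORECASE).str.join("|")
df["B"]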
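One thing worth checking with the snippet above is the effect of the `\b` word boundaries. The sketch below compares a bounded pattern with a plain one on shortened stand-ins for the question's rows. The column name `A`, the shortened strings, and the names `keywords`, `bounded`, `plain`, `B_bounded` and `B_plain` are assumptions made for illustration only.

```python
import re
import pandas as pd

# Shortened stand-ins for the rows in the question (column name "A" is assumed)
df = pd.DataFrame({"A": [
    "User mapping missing constantly for random users on product PLA-ZA.",
    "History ------------ *** Audit: MLACAMBR 01/10/2018 18:40:05 GMT",
    "1. find the process ids by doing ps -efl | grep BEMS74397",
    "kill each of these processes, as follows",
    "Troubleshoots from the KMCas well as from the sensor",
]})
keywords = ["PLA-ZA", "BEMS", "MLACAMBR", "KMC", "OWL"]

# Bounded pattern: a keyword only matches when it stands as a whole word
bounded = "|".join(rf"\b{re.escape(k)}\b" for k in keywords)
# Plain pattern: a keyword matches anywhere, even inside a longer token
plain = "|".join(re.escape(k) for k in keywords)

df["B_bounded"] = df["A"].str.findall(bounded, flags=re.IGNORECASE).str.join("|")
df["B_plain"] = df["A"].str.findall(plain, flags=re.IGNORECASE).str.join("|")
print(df[["B_bounded", "B_plain"]])
```

With these rows the bounded pattern leaves the `BEMS74397` and `KMCas` rows empty, because the keyword is followed by another word character, while the plain pattern picks up `BEMS` and `KMC`. Which behaviour is wanted depends on whether partial-token hits should count.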

Try this:

Contents of df before matching (the input df):

print(df)
                                             ColumnA
0  User mapping missing constantly for random use...        
1  History ------------ *** Audit: MLACAMBR 01/10...        
2  1. find the process ids by doing ps -efl | gre...        
3  kill each of these processes, as follows (for ...        
4  Troubleshoots from the KMCas well as from the ...        

Code:

df['ColumnB'] = ''
matches = ["PLA-ZA","BEMS","MLACAMBR","KMC","OWL"]

for index, row in df.iterrows():
    current_string = row['ColumnA']

    # Collect every keyword from matches that occurs in ColumnA of this row
    tmpmatch = []
    for x in matches:
        if x in current_string and x not in tmpmatch:
            tmpmatch.append(x)
    # Join once, after the inner loop, so the string is defined even if matches is empty
    tmp_matched_str = ','.join(tmpmatch)
    # Write the comma-separated matches into ColumnB of df
    df.loc[index, 'ColumnB'] = tmp_matched_str

Output:

print(df)
                                             ColumnA   ColumnB
0  User mapping missing constantly for random use...    PLA-ZA
1  History ------------ *** Audit: MLACAMBR 01/10...  MLACAMBR
2  1. find the process ids by doing ps -efl | gre...      BEMS
3  kill each of these processes, as follows (for ...          
4  Troubleshoots from the KMCas well as from the ...       KMC

This is the expected output!
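The same substring matching can also be written without an explicit `iterrows()` loop. This is only a sketch of one alternative, not a replacement the answer itself proposes: the helper name `find_keywords` is hypothetical, and the rows are shortened stand-ins for the question's data.

```python
import pandas as pd

def find_keywords(text, keywords):
    """Return the keywords that occur in text as substrings, comma-joined.

    dict.fromkeys drops duplicate keywords while keeping list order,
    mirroring the 'x not in tmpmatch' check in the loop version.
    """
    return ",".join(k for k in dict.fromkeys(keywords) if k in text)

# Shortened stand-ins for the rows in the question
df = pd.DataFrame({"ColumnA": [
    "User mapping missing constantly for random users on product PLA-ZA.",
    "History ------------ *** Audit: MLACAMBR 01/10/2018 18:40:05 GMT",
    "1. find the process ids by doing ps -efl | grep BEMS74397",
    "kill each of these processes, as follows",
    "Troubleshoots from the KMCas well as from the sensor",
]})
matches = ["PLA-ZA", "BEMS", "MLACAMBR", "KMC", "OWL"]

# Apply the helper to every row of ColumnA (case-sensitive substring test)
df["ColumnB"] = df["ColumnA"].apply(lambda s: find_keywords(s, matches))
print(df["ColumnB"].tolist())
```

Like the loop, this matches case-sensitively and accepts keywords embedded in longer tokens such as `BEMS74397`.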
