
In pandas, how to extract specific words from a sentence in a column

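I have df1 and want to extract the 'flavor' from the sentences in 'desc' to get df2. I have a list of flavors, and that list decides which flavor is picked. How can I get this result in Python?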

df1:
desc                                flavor
Coke 600mL and Chips                
Coke Zero 600mL and Chips           
390ml Coke + Small Fries            
600ml Coke + Regular Fries with     
Vanilla Coke 600mL and Chips        
Garlic Bread and pepsi 1.25ltr

df2:
desc                               flavor
Coke 600mL and Chips               Coke
Coke Zero 600mL and Chips          Coke Zero
390ml Coke + Small Fries           Coke
600ml Coke + Regular Fries with    Coke
Vanilla Coke 600mL and Chips       Vanilla Coke
Garlic Bread and pepsi 1.25ltr     Pepsi

Flavor list:
Coke 
Coke Zero 
Vanilla Coke 
Pepsi

If you only want to extract one value per row based on the list, use str.extract:

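import re

# Longer names come first so that 'Coke Zero' is tried before 'Coke'.
L = ['Coke Zero', 'Vanilla Coke', 'Pepsi', 'Coke']
# Word boundaries keep a name from matching inside a longer word.
pat = '|'.join(r"\b{}\b".format(x) for x in L)

# df is the question's df1. The capture group returns the first match
# in each row, and re.I makes the match case-insensitive.
df['flavor'] = df['desc'].str.extract('(' + pat + ')', expand=False, flags=re.I)
print(df)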
                              desc        flavor
0             Coke 600mL and Chips          Coke
1        Coke Zero 600mL and Chips     Coke Zero
2         390ml Coke + Small Fries          Coke
3  600ml Coke + Regular Fries with          Coke
4     Vanilla Coke 600mL and Chips  Vanilla Coke
5   Garlic Bread and pepsi 1.25ltr         pepsi
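Note that the output above keeps the text exactly as it appears in `desc`, so the last row is `pepsi` rather than the `Pepsi` shown in df2. Below is a minimal sketch of one way to map each match back to the spelling used in the flavor list, assuming a case-insensitive lookup is acceptable. The names `flavors`, `ordered`, `canonical`, and `matched` are illustrative and not part of the original answer; the `re.escape` call and the length-based sort are also additions made for this sketch.

```python
import re
import pandas as pd

# Rebuild df1 from the question so the example is self-contained.
df = pd.DataFrame({
    "desc": [
        "Coke 600mL and Chips",
        "Coke Zero 600mL and Chips",
        "390ml Coke + Small Fries",
        "600ml Coke + Regular Fries with",
        "Vanilla Coke 600mL and Chips",
        "Garlic Bread and pepsi 1.25ltr",
    ]
})

flavors = ["Coke", "Coke Zero", "Vanilla Coke", "Pepsi"]

# Assumption: sorting longest-first lets the list be given in any order,
# because "Coke Zero" is then tried before "Coke" at the same position.
ordered = sorted(flavors, key=len, reverse=True)
pat = "|".join(r"\b{}\b".format(re.escape(x)) for x in ordered)

# Lookup from lowercase text to the spelling used in the flavor list.
canonical = {x.lower(): x for x in flavors}

# Extract the first match per row, then normalise its casing.
matched = df["desc"].str.extract("(" + pat + ")", expand=False, flags=re.I)
df["flavor"] = matched.str.lower().map(canonical)

print(df)
```

Rows with no match stay as NaN, since both `str.lower` and `map` pass missing values through.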

If a row may contain multiple flavours, use str.findall to get a list of matches, then str.join:

# findall returns every match in each row as a list; join merges the list into one string.
df['flavor'] = df['desc'].str.findall(pat, flags=re.I).str.join(' ')

print(df)
                              desc        flavor
0             Coke 600mL and Chips          Coke
1        Coke Zero 600mL and Chips     Coke Zero
2         390ml Coke + Small Fries          Coke
3  600ml Coke + Regular Fries with          Coke
4     Vanilla Coke 600mL and Chips  Vanilla Coke
5   Garlic Bread and pepsi 1.25ltr         pepsi
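The sample data has only one flavour per row, so the result above looks the same as with str.extract. The sketch below shows how the findall approach behaves when a row mentions more than one flavour. The rows are hypothetical and not from the question, and the `", "` separator is an assumption chosen so that multi-word names such as `Coke Zero` stay distinguishable after joining.

```python
import re
import pandas as pd

# Hypothetical rows (not from the question): two mention several flavours,
# one mentions none.
df = pd.DataFrame({
    "desc": [
        "Coke Zero and Pepsi combo",
        "Vanilla Coke 600mL + coke 390ml",
        "Garlic Bread only",
    ]
})

# Same list order as the answer: longer names before 'Coke'.
L = ["Coke Zero", "Vanilla Coke", "Pepsi", "Coke"]
pat = "|".join(r"\b{}\b".format(re.escape(x)) for x in L)

# findall yields a list of all non-overlapping matches per row.
found = df["desc"].str.findall(pat, flags=re.I)

# Join with ", " so that "Coke Zero, Pepsi" cannot be misread.
df["flavor"] = found.str.join(", ")

print(df)
```

A row with no match produces an empty list, which joins to an empty string rather than NaN.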

