Python search exact word from list in string?
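I need to find exact words from a list inside a string.
I tried the code below. It gives me exact matches for single-word entries in the list, but how do I match entries made up of two words?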
categories_to_retain = ['SOLID',
'GEOMETRIC',
'FLORAL',
'BOTANICAL',
'STRIPES',
'ABSTRACT',
'ANIMAL',
'GRAPHIC PRINT',
'ORIENTAL',
'DAMASK',
'TEXT',
'CHEVRON',
'PLAID',
'PAISLEY',
'SPORTS']
x = " Beautiful Art By Design Studio **graphic print** Creates A **TEXT** Design For This Art Driven Duvet. Printed In Remarkable Detail On A Woven Duvet, This Is An Instant Focal Point Of Any Bedroom. The Fabric Is Woven Of Easy Care Polyester And Backed With A Soft Poly/Cotton Blend Fabric. The Texture In The Fabric Gives Dimension And A Unique Look And Feel To The Duvet."
x = x.upper()
print(x)
#x = "GRAPHIC"
#x = "GRAPHIC PRINTS"
matches = [cat for cat in categories_to_retain if cat in x.split()]
print(matches)
Output:
['TEXT']
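As you can see, my list contains an entry named 'GRAPHIC PRINT'. I want to find that phrase in my string.
I also need to find an entry when it appears in plural or past-tense form, for example STRIPED, STRIPE, GRAPHIC PRINTS, and so on.
Thanks, Niranjan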
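Use a regular expression with word boundaries to get exact matches. Even if you only had single words, your logic would fail if you were trying to ignore any punctuation: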
import re
patts = re.compile("|".join(r"\b{}\b".format(s) for s in categories_to_retain), re.I)
x = " Beautiful Art By Design Studio **graphic print** Creates A **TEXT** Design For This Art Driven Duvet. Printed In Remarkable Detail On A Woven Duvet, This Is An Instant Focal Point Of Any Bedroom. The Fabric Is Woven Of Easy Care Polyester And Backed With A Soft Poly/Cotton Blend Fabric. The Texture In The Fabric Gives Dimension And A Unique Look And Feel To The Duvet."
print(patts.findall(x))
Which would give you:
['graphic print', 'TEXT']
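Note that findall returns each hit as it is spelled in the string ('graphic print'), not as it is spelled in the list. A minimal sketch of one way to map hits back to the upper-case list entries and drop repeats follows. The shortened `categories` list, the sample `text`, and the use of `re.escape` are assumptions added for illustration, not part of the answer above.

```python
import re

categories = ['GRAPHIC PRINT', 'TEXT', 'FLORAL']  # shortened list for the example

# re.escape guards against categories that contain regex metacharacters.
pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(c) for c in categories) + r")\b",
    re.IGNORECASE,
)

text = "A graphic print with Text, more text, and a Graphic Print border."

# Normalise each hit to upper case and drop duplicates, keeping first-seen order.
found = list(dict.fromkeys(m.upper() for m in pattern.findall(text)))
print(found)
```

This relies on every list entry already being upper case, as in the question.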
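You can use a regular expression. This also avoids matching a character sequence that is only part of a longer word, and it reports the exact entry from the input list.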
import re
matches = []
categories_to_retain = ['SOLID',
'GEOMETRIC',
'FLORAL',
'BOTANICAL',
'STRIPES',
'ABSTRACT',
'ANIMAL',
'GRAPHIC PRINT',
'ORIENTAL',
'DAMASK',
'TEXT',
'CHEVRON',
'PLAID',
'PAISLEY',
'SPORTS']
x = " Beautiful Art By Design Studio **graphic print** Creates A **TEXT** Design For This Art Driven Duvet. Printed In Remarkable Detail On A Woven Duvet, This Is An Instant Focal Point Of Any Bedroom. The Fabric Is Woven Of Easy Care Polyester And Backed With A Soft Poly/Cotton Blend Fabric. The Texture In The Fabric Gives Dimension And A Unique Look And Feel To The Duvet."
x = x.upper()
print(x)
def searchWholeWord(w):
    # Return a search function that matches w only as a whole word, ignoring case.
    return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search

for cat in categories_to_retain:
    return_value = searchWholeWord(cat)(x)
    if return_value:
        matches.append(cat)
print(matches)
Output:
['GRAPHIC PRINT', 'TEXT']
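The question also asks about inflected forms such as STRIPED, STRIPE and GRAPHIC PRINTS, which a strict whole-word pattern will not catch. One rough way to extend the per-category search is sketched below. The `SUFFIXES` list, the rule of dropping one trailing "S", and the helper names `build_pattern` and `find_categories` are all hypothetical choices for illustration. It is a crude heuristic, not real stemming, and irregular forms would need a proper stemmer.

```python
import re

# Hypothetical suffix list; extend it for the inflections you care about.
SUFFIXES = r"(?:S|ES|D|ED)?"

def build_pattern(category):
    # Drop one trailing "S" so that "STRIPES" also covers "STRIPE" and "STRIPED".
    stem = category[:-1] if category.endswith("S") else category
    # Allow any run of whitespace between the words of a multi-word category.
    body = r"\s+".join(re.escape(word) for word in stem.split())
    return re.compile(r"\b" + body + SUFFIXES + r"\b", re.IGNORECASE)

def find_categories(text, categories):
    # Keep each category whose (loosened) pattern occurs somewhere in the text.
    return [cat for cat in categories if build_pattern(cat).search(text)]

categories = ['STRIPES', 'GRAPHIC PRINT', 'TEXT', 'FLORAL']
sample = "A striped duvet with graphic prints and a rich texture."
print(find_categories(sample, categories))
```

Because the closing `\b` is kept, "texture" still does not count as a hit for 'TEXT'.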
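Here you are splitting the string with the default split(), which splits at every whitespace character: x.split() will contain the strings "GRAPHIC" and "PRINT", but not "GRAPHIC PRINT". You probably want to use "if cat in x" instead, which I believe will return what you need in this case.
This should work: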
matches = [cat for cat in categories_to_retain if cat in x]
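One caveat with a plain substring test: `in` also matches inside longer words, so 'TEXT' is found in 'TEXTURE', a word the sample description contains once it is upper-cased. The short `text` below is a made-up fragment used only to show the difference between the two checks.

```python
import re

text = "THE TEXTURE IN THE FABRIC GIVES DIMENSION"

# Plain substring test: "TEXT" is found inside "TEXTURE".
substring_hit = "TEXT" in text

# Word-boundary test: "TEXT" must stand alone as a whole word.
whole_word_hit = re.search(r"\bTEXT\b", text) is not None

print(substring_hit)   # True
print(whole_word_hit)  # False
```

If such partial hits matter for your data, the word-boundary patterns from the other answers are the safer choice.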