spaCy PhraseMatcher whitespace-sensitivity issue
terms = ["Barack Obama", "Angela Merkel", "Washington, D.C."]
doc = nlp("German Chancellor Angela Merkel and US President Barack Obama "
"converse in the Oval Office inside the White House in Washington, D.C.")
If there is an extra space between the two words of "Barack Obama", the PhraseMatcher no longer matches, because it is whitespace-sensitive. Is there any way to overcome this whitespace-sensitivity issue? As a workaround I collapse the extra spaces with a regex:
import re
re.sub(' +', ' ', "barack  obama")
# output
'barack obama'
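As a side note (a sketch, not part of the original question): the pattern `' +'` only collapses runs of literal spaces. Using `r'\s+'` together with `.strip()` also normalizes tabs, newlines, and leading/trailing whitespace, which is often what you want before tokenizing:

```python
import re

# r'\s+' matches any run of whitespace (spaces, tabs, newlines),
# so tab-separated or multi-line input is normalized too.
text = "barack \t obama"
collapsed = re.sub(r'\s+', ' ', text).strip()
print(collapsed)  # barack obama
```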
Reference documentation: https://spacy.io/api/phrasematcher
from spacy.matcher import PhraseMatcher
import en_core_web_sm

nlp = en_core_web_sm.load()
matcher = PhraseMatcher(nlp.vocab)
matcher.add("OBAMA", None, nlp("Barack Obama"))
doc = nlp("Barack Obama urges Congress to find courage to defend his healthcare reforms")
matches = matcher(doc)
# output
[(7732777389095836264, 0, 2)]
But when there are multiple spaces between the words, it returns an empty list, i.e. with more than one space between "Barack" and "Obama". (spaCy tokenizes a run of extra spaces into its own whitespace token, so the document's token sequence no longer lines up with the pattern's tokens.)
doc = nlp("Barack  Obama urges Congress to find courage to defend his healthcare reforms")
print(matcher(doc))
# output
[]
To work around this, I remove the extra spaces from the string before passing it to the model:
string_ = 'Barack  Obama urges Congress to find courage to defend his healthcare reforms'
space_removed_string = re.sub(' +', ' ', string_)
# now pass the cleaned string to the model
doc = nlp(space_removed_string)
print(matcher(doc))
# output
[(7732777389095836264, 0, 2)]
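One way to make this workaround reusable is to wrap the substitution in a small helper and run it over both the pattern phrases and the raw text before either reaches `nlp()`, so the two sides always tokenize consistently. This is a sketch; `normalize_ws` is a hypothetical helper name, not a spaCy API:

```python
import re

def normalize_ws(text):
    # Collapse any run of whitespace to a single space and trim the ends.
    return re.sub(r'\s+', ' ', text).strip()

# Hypothetical usage with the matcher from above (assumes nlp and
# matcher are already set up as in the question):
#   matcher.add("OBAMA", None, nlp(normalize_ws("Barack  Obama")))
#   doc = nlp(normalize_ws(raw_text))
print(normalize_ws('Barack  Obama  urges  Congress'))  # Barack Obama urges Congress
```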