Removing a list of words and replacing others
Please help me.
So far I have completed step 1, removing the stopwords (see the code below). It works well:
stopwords = ['what', 'hello', 'and', 'at', 'is', 'am', 'i']
search_list = ['where is north and northern side',
               'ask in the community at western environmental',
               'my name is alan and i am coming from london southeast']
dictionary = {'n': ['north', 'northern'],
              's': ['south', 'southern'],
              'e': ['east', 'eastern'],
              'w': ['west', 'western'],
              'env': ['environ.', 'enviornment', 'environmental']}
result = [' '.join(w for w in place.split() if w.lower() not in stopwords)
          for place in search_list]
print(result)
For step 2, I need the final output shown below. What should I change or add in the code above to get it? Any alternative approaches are also welcome.
['where n n side', 'ask in the community w env', 'my name alan coming from london s']
You have to "invert" the dictionary, because the lookup goes the other way (from word to abbreviation):
rev_dict = {v:k for k,l in dictionary.items() for v in l}
Now it is convenient for replacement:
>>> rev_dict
{'east': 'e',
'eastern': 'e',
'enviornment': 'env',
'environ.': 'env',
'environmental': 'env',
'north': 'n',
'northern': 'n',
'south': 's',
'southern': 's',
'west': 'w',
'western': 'w'}
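Split your strings again (you could keep the word lists around to avoid splitting a second time) and replace each word, falling back to the word itself when there is no match: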
result = [" ".join([rev_dict.get(x,x) for x in s.split()]) for s in result]
Or combine the stopword removal and the replacement in one pass:
stopwords={'what','hello','and','at','is','am','i'} # define as a set for fast lookup
result = [" ".join([rev_dict.get(x,x) for x in s.split() if x not in stopwords]) for s in search_list]
In both cases, the result is:
['where n n side', 'ask in the community w env', 'my name alan coming from london southeast']
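Note that 'southeast' is left as is, because it does not appear in any of the dictionary's word lists. For reference, the whole thing can be sketched as one self-contained script. This is only a sketch under some assumptions: the helper names `invert` and `clean` are made up for illustration, and lowercasing each word before the lookup is an addition carried over from the `w.lower()` in the question, not part of the answer above.

```python
stopwords = {'what', 'hello', 'and', 'at', 'is', 'am', 'i'}
search_list = ['where is north and northern side',
               'ask in the community at western environmental',
               'my name is alan and i am coming from london southeast']
dictionary = {'n': ['north', 'northern'],
              's': ['south', 'southern'],
              'e': ['east', 'eastern'],
              'w': ['west', 'western'],
              'env': ['environ.', 'enviornment', 'environmental']}


def invert(mapping):
    """Turn {abbreviation: [words]} into {word: abbreviation}."""
    return {word: abbrev for abbrev, words in mapping.items() for word in words}


def clean(phrases, stopwords, mapping):
    """Drop stopwords and abbreviate known words in a single pass."""
    lookup = invert(mapping)
    stop = {w.lower() for w in stopwords}
    return [
        # Compare in lowercase (assumption), but keep unknown words unchanged.
        " ".join(lookup.get(w.lower(), w)
                 for w in phrase.split()
                 if w.lower() not in stop)
        for phrase in phrases
    ]


print(clean(search_list, stopwords, dictionary))
# ['where n n side', 'ask in the community w env', 'my name alan coming from london southeast']
```

To get 'london s' as in the desired output, one option would be to add 'southeast' to the list under 's' in the dictionary; whether that is the intended mapping is for the asker to decide.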