How to replace a string using a dictionary containing multiple values for a key in Python
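I have a dictionary that contains words and their most closely related words.
I want to replace the related words in a string with the original word. Currently I can replace words in a string only when each key has a single value; I cannot do it for keys that have multiple values. How can this be done?
Example input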
North Indian Restaurant
South India Hotel
Mexican Restrant
Italian Hotpot
Cafe Bar
Irish Pub
Maggiee Baar
Jacky Craft Beer
Bristo 1889
Bristo 188
Bristo 188.
How the dictionary is built
words = list(word)
similar = [[item[0] for item in model.wv.most_similar(word) if item[1] > 0.7] for word in words]
similarity_matrix = pd.DataFrame({'Orginal_Word': words, 'Related_Words': similar})
similarity_matrix = similarity_matrix[['Orginal_Word', 'Related_Words']]
The resulting DataFrame has two columns, each holding lists
Orginal_Word    Related_Words
[Indian]        [India, Ind, ind.]
[Restaurant]    [Hotel, Restrant, Hotpot]
[Pub]           [Bar, Baar, Beer]
[1888]          [188, 188., 18]
Dictionary
similarity_matrix.set_index('Orginal_Word')['Related_Words'].to_dict()
{'Indian ': 'India, Ind, ind.',
'Restaurant': 'Hotel, Restrant, Hotpot',
'Pub': 'Bar, Baar, Beer',
'1888': '188, 188., 18'}
Expected output
North Indian Restaurant
South Indian Restaurant
Mexican Restaurant
Italian Restaurant
Cafe Pub
Irish Pub
Maggiee Pub
Jacky Craft Pub
Bristo 1888
Bristo 1888
Bristo 1888
Any help is appreciated.
I think you can use replace with a new dictionary of regex patterns, as in this answer:
d = {'Indian': 'India, Ind, ind.',
'Restaurant': 'Hotel, Restrant, Hotpot',
'Pub': 'Bar, Baar, Beer',
'1888': '188, 188., 18'}
d1 = {r'(?<!\S)'+ k.strip() + r'(?!\S)':k1 for k1, v1 in d.items() for k in v1.split(',')}
df['col'] = df['col'].replace(d1, regex=True)
print(df)
                        col
0   North Indian Restaurant
1   South Indian Restaurant
2        Mexican Restaurant
3        Italian Restaurant
4                  Cafe Pub
5                 Irish Pub
6               Maggiee Pub
7           Jacky Craft Pub
8               Bristo 1888
9               Bristo 1888
10              Bristo 1888
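To make the dict comprehension easier to follow, here is a minimal sketch that builds the same kind of pattern mapping and applies it with plain re.sub instead of pandas. The helper name apply_patterns and the sample strings are only illustrative, and re.escape is added as a precaution so that characters such as "." are matched literally.

```python
import re

# Canonical word -> comma-separated variants (same data as in the answer).
d = {
    'Indian': 'India, Ind, ind.',
    'Restaurant': 'Hotel, Restrant, Hotpot',
    'Pub': 'Bar, Baar, Beer',
    '1888': '188, 188., 18',
}

# One regex per variant. The lookarounds require whitespace (or the string
# boundary) on both sides, so 'India' does not match inside 'Indian'.
patterns = {
    r'(?<!\S)' + re.escape(variant.strip()) + r'(?!\S)': canonical
    for canonical, variants in d.items()
    for variant in variants.split(',')
}

def apply_patterns(text):
    """Hypothetical helper: apply every pattern in turn to one string."""
    for pattern, canonical in patterns.items():
        text = re.sub(pattern, canonical, text)
    return text

samples = ['South India Hotel', 'Maggiee Baar', 'Bristo 188.',
           'North Indian Restaurant']
results = [apply_patterns(s) for s in samples]
print(results)
```

Each variant becomes its own key, so a canonical word with three variants contributes three patterns that all map back to it.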
Edit (the code above wrapped in a function):
def replace_words(d, col):
    d1 = {r'(?<!\S)' + k.strip() + r'(?!\S)': k1 for k1, v1 in d.items() for k in v1.split(',')}
    df[col] = df[col].replace(d1, regex=True)
    return df[col]
df['col'] = replace_words(d, 'col')
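As an alternative to one regex per variant, the variants can be merged into a single compiled alternation with a callback that looks up the canonical word. This is a different technique from the answer's replace call, sketched here with the standard library only; the names lookup and normalize are hypothetical.

```python
import re

d = {
    'Indian': 'India, Ind, ind.',
    'Restaurant': 'Hotel, Restrant, Hotpot',
    'Pub': 'Bar, Baar, Beer',
    '1888': '188, 188., 18',
}

# Invert the dictionary: variant -> canonical word.
lookup = {
    variant.strip(): canonical
    for canonical, variants in d.items()
    for variant in variants.split(',')
}

# Longest variants first, so '188.' is tried before '188' and '18'.
alternation = '|'.join(
    re.escape(v) for v in sorted(lookup, key=len, reverse=True)
)
pattern = re.compile(r'(?<!\S)(?:' + alternation + r')(?!\S)')

def normalize(text):
    """Hypothetical helper: swap each matched variant for its canonical word."""
    return pattern.sub(lambda m: lookup[m.group(0)], text)

print(normalize('Jacky Craft Beer'))
print(normalize('Bristo 188.'))
```

On a DataFrame column this could be applied with df['col'].map(normalize), assuming every value in the column is a string.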
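Edit 1:
If you get an error like the following:
regex error- missing ), unterminated subpattern at position 7
you need to escape the regex special characters in the keys: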
import re
def replace_words(d, col):
    d1 = {r'(?<!\S)' + re.escape(k.strip()) + r'(?!\S)': k1 for k1, v1 in d.items() for k in v1.split(',')}
    df[col] = df[col].replace(d1, regex=True)
    return df[col]
df['col'] = replace_words(d, 'col')
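The prefix (?<!\S) is seven characters long, so an error at position 7 suggests that one of the variants begins with an unescaped "(". The sketch below reproduces that situation with a made-up variant, '(old', and shows how re.escape changes the outcome. The variant and the sample strings are assumptions for illustration, not data from the question.

```python
import re

variant = '(old'  # hypothetical variant containing a regex metacharacter

# Unescaped, the '(' opens a group that is never closed, so compiling fails.
try:
    re.compile(r'(?<!\S)' + variant + r'(?!\S)')
    raw_error = None
except re.error as exc:
    raw_error = str(exc)

# Escaped, the same text is matched literally.
safe = re.compile(r'(?<!\S)' + re.escape(variant) + r'(?!\S)')
fixed = safe.sub('Pub', 'Cafe (old')

# Escaping also changes what '.' means: unescaped, '188.' matches '188'
# followed by any single character; escaped, it needs a literal dot.
loose = re.sub(r'(?<!\S)188.(?!\S)', '1888', 'Bristo 188x')
strict = re.sub(r'(?<!\S)' + re.escape('188.') + r'(?!\S)', '1888', 'Bristo 188x')

print(raw_error)
print(fixed, '|', loose, '|', strict)
```

The second half shows why escaping matters even when no error is raised: an unescaped pattern can silently replace strings that were never listed as variants.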