
How to replace multiple words with a single word using a dictionary in Python?

I have a dictionary in which each key maps to a list of values, as below:

word_list = {'cool':['better','good','best','great'], 'car':['vehicle','moving','automobile','four-wheeler'], 'sound':['noise', 'disturbance', 'rattle']}
sentences = ['that day I heard a vehicle noise not a great four-wheeler', 'the automobile industry is doing good these days', 'that moving noise is better now']

Each key has multiple values. If any of these values appears in a sentence, I want to replace it with its associated key.

I tried the following, but did not get the desired output.

results = [' '.join(word_list.get(y, y) for y in w.split()) for w in sentences]

Desired output:

['that day I heard a car sound not a cool car', 'the car industry is doing cool these days', 'that car sound is cool now']

Not sure how to achieve this.

The trick is to build an inverted mapping: each word in a value list becomes a key, and the original key (the replacement) becomes its value.

After that it is straightforward: iterate over the words of each sentence and replace a word with its value from the inverted mapping whenever the word is one of that mapping's keys; otherwise leave it unchanged.

word_list = {
    'cool': ['better','good','best','great'],
    'car': ['vehicle','moving','automobile','four-wheeler'],
    'sound': ['noise', 'disturbance', 'rattle']
}
sentences = [
    'that day I heard a vehicle noise not a great four-wheeler',
    'the automobile industry is doing good these days',
    'that moving noise is better now'
]

swapped_word_list = {
    word: replacement
    for replacement, words in word_list.items()
    for word in words
}
new_sentences = [
    ' '.join([
        swapped_word_list.get(word, word)
        for word in sentence.split()
    ])
    for sentence in sentences
]
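To check the result, the same inverted-mapping approach can be wrapped in a small function and printed. This is only a minimal sketch: the function name `replace_words` is hypothetical and not part of the answer above. Note that `str.split()` separates on whitespace only, so a word with punctuation attached (for example `noise,`) would not be found in the mapping and would be left as-is.

```python
def replace_words(sentences, word_list):
    # Invert the mapping: each synonym points to its replacement key.
    inverted = {
        word: replacement
        for replacement, words in word_list.items()
        for word in words
    }
    # Look up every whitespace-separated word; unknown words are kept unchanged.
    return [
        ' '.join(inverted.get(word, word) for word in sentence.split())
        for sentence in sentences
    ]


word_list = {
    'cool': ['better', 'good', 'best', 'great'],
    'car': ['vehicle', 'moving', 'automobile', 'four-wheeler'],
    'sound': ['noise', 'disturbance', 'rattle'],
}
sentences = [
    'that day I heard a vehicle noise not a great four-wheeler',
    'the automobile industry is doing good these days',
    'that moving noise is better now',
]

for s in replace_words(sentences, word_list):
    print(s)
# that day I heard a car sound not a cool car
# the car industry is doing cool these days
# that car sound is cool now
```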

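A solution using regex and reduce, because why not:

  • create a list of mappings, each pairing a regex pattern that matches all the words of one list with its replacement word
  • apply the mappings to each string one after another using reduce, so that the output of one substitution is the input of the next

Note that the rf prefix before the string marks it as a raw f-string: backslashes such as \b are kept literally, while the expression in braces is still interpolated.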

from functools import reduce
import re

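# One pattern per replacement word; \b restricts matches to whole words.
# re.escape guards against words that contain regex metacharacters.
mappings = [
  {'pat': rf'\b({"|".join(re.escape(w) for w in words)})\b', 'rep': rep}
  for rep, words in word_list.items()
]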

cleaned_sentences = [
  reduce(lambda s, m: re.sub(m['pat'], m['rep'], s), mappings, sentence)
  for sentence in sentences
]
for s in cleaned_sentences:
  print(s)
# outputs:
# that day I heard a car sound not a cool car
# the car industry is doing cool these days
# that car sound is cool now
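
The two ideas can also be combined into a single pass: one compiled pattern that matches any of the words, with the replacement looked up in the inverted mapping by a callback passed to `re.sub`. This is a sketch under assumptions, not the method of either answer; the names `inverted`, `pattern` and `replace_synonyms` are hypothetical. Unlike the `split()` version, it also handles words followed by punctuation.

```python
import re

word_list = {
    'cool': ['better', 'good', 'best', 'great'],
    'car': ['vehicle', 'moving', 'automobile', 'four-wheeler'],
    'sound': ['noise', 'disturbance', 'rattle'],
}

# Invert the mapping: synonym -> replacement key.
inverted = {word: rep for rep, words in word_list.items() for word in words}

# A single alternation over all synonyms, longest first so that a longer
# synonym is preferred over a shorter one sharing the same prefix.
# re.escape guards against regex metacharacters in the words.
pattern = re.compile(
    r'\b(?:'
    + '|'.join(re.escape(w) for w in sorted(inverted, key=len, reverse=True))
    + r')\b'
)


def replace_synonyms(sentence):
    # The callback receives each match and returns its replacement key.
    return pattern.sub(lambda m: inverted[m.group(0)], sentence)


print(replace_synonyms('that day I heard a vehicle noise, not a great four-wheeler.'))
# that day I heard a car sound, not a cool car.
```

Because every sentence is scanned once, a replacement word is never re-examined by a later pattern, which avoids chained substitutions if a key ever also appears in another key's list.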
