
Replace part of an element in a list with a value from a dictionary - Python

I'm working on a project that converts French words into Cyrillic letters.

dict3 = {"ain":"(ен)",
         "oin":"(уен)"}

dict2 = {"on":"(он)",
        "in":"(ин)",
        "en":"(ен)",
        "eu":"(ё)",
        "an":"(ен)",
        "ou":"у",
        "oi":"уа",
        "au":"о",
        "ai":"э",
        "un":"(ен)",
        "ya":"я",
        "gn":"нь",
        "qu":"к",
        "ch":"ш"
         }

dict1 = {
    "a":"а",
    "b":"б",
    "c":"с",
    "ç":"с",
    "d":"д",
    "e":"",
    "é":"э",
    "è":"э",
    "ê":"э",
    "e ":"е",
    "f":"ф",
    "g":"г",
    "h": "",
    "i":"и",
    "j":"ж",
    "k":"к",
    "l":"л",
    "m":"м",
    "n":"н",
    "o":"о",
    "p":"п",
    "r":"р",
    "s":"с",
    "t":"т",
    "u":"(ю)",
    "v":"в",
    "w":"",
    "x":"кс",
    "y":"(и)",
    "z":"з"
    }
start=str(input())
start = start.split()
sortie = []
for i in range (len(start)):
    for x in range (len(dict3)):
        if dict3[x] in start[i]:
            sortie[i] = start[i][**first part**] + dict3[start[i]] + start[i][**next part**]

I first want to iterate through dict3, then dict2, then dict1, so that in the end only Cyrillic letters remain. I'm trying to modify only the part of a word that matches a key in a dictionary. How do I do that? Thanks.

When you are dealing with text and patterns, think regex.

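You can build regex patterns from your dicts and apply them. Dicts preserve insertion order (guaranteed from Python 3.7, and an implementation detail of CPython 3.6), so the keys are applied in the order in which they were added to the dict.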
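One way to build such patterns is sketched below. It joins the escaped keys of each dict into a single alternation and applies the dicts from longest keys to shortest. The helper names `build_pattern` and `apply_mapping` and the shortened dicts are made up for this illustration. Note that an alternation scans the text once per dict and replaces the leftmost match first, which is not quite the same as substituting key by key.

```python
import re

def build_pattern(mapping):
    """Compile one alternation pattern that matches any key of `mapping`."""
    # re.escape guards against keys that contain regex metacharacters.
    return re.compile("|".join(re.escape(key) for key in mapping))

def apply_mapping(text, mapping):
    """Replace every occurrence of a key in `text` with its value."""
    pattern = build_pattern(mapping)
    return pattern.sub(lambda m: mapping[m.group(0)], text)

# Shortened, hypothetical stand-ins for the full dicts.
three = {"ain": "(ен)"}
two = {"ou": "у", "ch": "ш"}
one = {"d": "д", "e": "", "i": "и", "z": "з"}

word = "dizaine"
for mapping in (three, two, one):  # longest keys first
    word = apply_mapping(word, mapping)

print(word)  # prints диз(ен)
```

The Cyrillic "е" inside "(ен)" survives the later "e" rule because it is a different code point from the Latin "e".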

Your character sets are mostly distinct, but chained replacements can cause problems: replacing "g" by "r" and then replacing "r" by "p" turns the original "g" into "p" (see quatrevingtdizaine), unless you want this kind of "double" replacement. If not, you need to handle it yourself, for example by replacing "g" with a placeholder Unicode symbol, then handling "r", and finally replacing the placeholder with "r".
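The chained-replacement problem can be reproduced with a toy mapping. The sketch below is an illustration only: the `chained` dict and both helper functions are hypothetical and not part of the original code. It compares key-by-key replacement with a single regex pass, which is an alternative to the placeholder trick, because text that has already been replaced is never scanned again.

```python
import re

# Hypothetical toy mapping in which one value ("r") is also a later key.
chained = {"g": "r", "r": "p"}

def replace_sequentially(text, mapping):
    """Apply each key in turn; earlier output can be replaced again."""
    for key, value in mapping.items():
        text = text.replace(key, value)
    return text

def replace_single_pass(text, mapping):
    """Scan the text once, so replaced text is never re-examined."""
    # Longest keys first, so a longer key wins over a key that is its prefix.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)

print(replace_sequentially("gr", chained))  # prints pp
print(replace_single_pass("gr", chained))   # prints rp
```

Whether the single pass gives the same result as the dict-by-dict loop for every French word depends on how the keys overlap, so treat it as a starting point rather than a drop-in replacement.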

Applied:

three = {"ain":"(ен)", "oin":"(уен)"}

two = {"on":"(он)", "in":"(ин)", "en":"(ен)", "eu":"(ё)", "an":"(ен)", 
       "ou":"у", "oi":"уа", "au":"о", "ai":"э", "un":"(ен)", "ya":"я", 
       "gn":"нь", "qu":"к", "ch":"ш" }

one = {"a":"а", "b":"б", "c":"с", "ç":"с", "d":"д", "e":"", "é":"э", 
       "è":"э", "ê":"э", "e ":"е", "f":"ф", "g":"г", "h": "", "i":"и",
       "j":"ж", "k":"к", "l":"л", "m":"м", "n":"н", "o":"о", "p":"п", 
       "r":"р", "s":"с", "t":"т", "u":"(ю)", "v":"в", "w":"", "x":"кс", 
       "y":"(и)", "z":"з" }


import re

words = "quatrevingtdizaine l'info au plus près de chez vous".split()

result = []

for word in words:
    fr = word
    print(fr, end="")
    for d in [three, two, one]:  # dicts in order: 3-, 2-, then 1-letter keys
        for key, value in d.items():  # key, value pairs in insertion order
            previous = fr
            # re.escape keeps the key literal; replace every occurrence
            fr = re.sub(re.escape(key), value, fr)
            if previous != fr:
                print(" ->", fr, end="")  # trace each step that changed the word
    print()
    result.append(fr)

print("", *result, sep="\n")

Output:

"""
quatrevingtdizaine -> quatrevingtdiz(ен)e -> quatrev(ин)gtdiz(ен)e
-> кatrev(ин)gtdiz(ен)e -> каtrev(ин)gtdiz(ен)e -> каtrev(ин)gtдiz(ен)e 
-> каtrv(ин)gtдiz(ен) -> каtrv(ин)гtдiz(ен) -> каtrv(ин)гtдиz(ен) 
-> каtрv(ин)гtдиz(ен) -> катрv(ин)гтдиz(ен) -> катрв(ин)гтдиz(ен) 
-> катрв(ин)гтдиз(ен)
l'info -> l'(ин)fo -> l'(ин)фo -> л'(ин)фo -> л'(ин)фо
au -> о
plus -> pлus -> плus -> плuс -> пл(ю)с
près -> prэs -> пrэs -> прэs -> прэс
de -> дe -> д
chez -> шez -> шz -> шз
vous -> vуs -> vус -> вус

катрв(ин)гтдиз(ен)
л'(ин)фо
о
пл(ю)с
прэс
д
шз
вус
 
"""
