Replace a part of an element in a list with a value from a dictionary - Python

I'm working on a project that converts French words into Cyrillic letters.

dict3 = {"ain":"(ен)",
         "oin":"(уен)"}

dict2 = {"on":"(он)",
        "in":"(ин)",
        "en":"(ен)",
        "eu":"(ё)",
        "an":"(ен)",
        "ou":"у",
        "oi":"уа",
        "au":"о",
        "ai":"э",
        "un":"(ен)",
        "ya":"я",
        "gn":"нь",
        "qu":"к",
        "ch":"ш"
         }

dict1 = {
    "a":"а",
    "b":"б",
    "c":"с",
    "ç":"с",
    "d":"д",
    "e":"",
    "é":"э",
    "è":"э",
    "ê":"э",
    "e ":"е",
    "f":"ф",
    "g":"г",
    "h": "",
    "i":"и",
    "j":"ж",
    "k":"к",
    "l":"л",
    "m":"м",
    "n":"н",
    "o":"о",
    "p":"п",
    "r":"р",
    "s":"с",
    "t":"т",
    "u":"(ю)",
    "v":"в",
    "w":"",
    "x":"кс",
    "y":"(и)",
    "z":"з"
    }
start=str(input())
start = start.split()
sortie = []
for i in range (len(start)):
    for x in range (len(dict3)):
        if dict3[x] in start[i]:
            sortie[i] = start[i][**first part**] + dict3[start[i]] + start[i][**next part**]

I first want to iterate through dict3, then dict2, and then dict1, so that in the end only Cyrillic letters remain. I'm trying to modify only the part of a word that is found in a dictionary. How do I do that? Thanks.

If you think of texts and patterns, think regex.

You can build regex patterns from your dicts and apply them. From Python 3.7 on (and in CPython 3.6 as an implementation detail), dicts preserve insertion order, so the replacements are applied in the order in which the keys were created.
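As a minimal sketch of "building a pattern from a dict", the keys of one table can be joined into a single alternation and the match looked up in the table. The helper names `build_pattern` and `apply_table` are made up for this illustration, and the tables are short excerpts of the ones in the question. Note that this is a variation: it tries all keys of one table at each position in a single scan, rather than substituting key after key.

```python
import re

# Short excerpts of the question's tables; dicts keep insertion order.
three = {"ain": "(ен)", "oin": "(уен)"}
two = {"in": "(ин)", "ou": "у", "ch": "ш"}


def build_pattern(table):
    """Compile one alternation such as 'ain|oin' from the keys of a dict."""
    # re.escape keeps every key literal, even if it held a regex metacharacter.
    return re.compile("|".join(re.escape(key) for key in table))


def apply_table(table, text):
    """Replace every key of `table` that occurs in `text` with its value."""
    pattern = build_pattern(table)
    # The callback looks up the matched key in the table.
    return pattern.sub(lambda match: table[match.group(0)], text)


for word in ["lointain", "chemin"]:
    converted = word
    for table in (three, two):  # longest keys first
        converted = apply_table(table, converted)
    print(word, "->", converted)
```

Only the three-letter and two-letter tables are applied here, so Latin letters that belong to the single-letter table are left untouched.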

Your character sets are mostly distinct, but replacing "g" by "r" and then replacing "r" by "p" will cause problems (see quatrevingtdizaine), unless you want this kind of "double" replacement. If not, you need to fix that yourself (for example by replacing g with some placeholder Unicode symbol, then handling r, then replacing that placeholder symbol back to r).
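An alternative to the placeholder trick is to do all replacements in one pass, so that replaced text is never scanned again. The sketch below uses a deliberately hypothetical table, `chained`, in which the output of one rule is the key of the next, to show the difference. The function names are invented for this example.

```python
import re

# Hypothetical table: the value of the first rule is the key of the second.
chained = {"g": "r", "r": "p"}


def replace_sequentially(table, text):
    """Apply the rules one after another; later rules see earlier output."""
    for key, value in table.items():
        text = text.replace(key, value)
    return text


def replace_single_pass(table, text):
    """Apply all rules in one scan; replaced text is not matched again."""
    # Longest keys first, so "ain" would win over "a" in a merged table.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: table[match.group(0)], text)


print(replace_sequentially(chained, "grand"))  # ppand ("g" became "r", then "p")
print(replace_single_pass(chained, "grand"))   # rpand (each letter replaced once)
```

Whether a single pass fits depends on whether any rule is meant to act on the output of an earlier one; if so, the sequential version is the one to keep.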

Applied:

three = {"ain":"(ен)", "oin":"(уен)"}

two = {"on":"(он)", "in":"(ин)", "en":"(ен)", "eu":"(ё)", "an":"(ен)", 
       "ou":"у", "oi":"уа", "au":"о", "ai":"э", "un":"(ен)", "ya":"я", 
       "gn":"нь", "qu":"к", "ch":"ш" }

one = {"a":"а", "b":"б", "c":"с", "ç":"с", "d":"д", "e":"", "é":"э", 
       "è":"э", "ê":"э", "e ":"е", "f":"ф", "g":"г", "h": "", "i":"и",
       "j":"ж", "k":"к", "l":"л", "m":"м", "n":"н", "o":"о", "p":"п", 
       "r":"р", "s":"с", "t":"т", "u":"(ю)", "v":"в", "w":"", "x":"кс", 
       "y":"(и)", "z":"з" }


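import re

words = "quatrevingtdizaine l'info au plus près de chez vous".split()

result = []

for word in words:
    current = word
    print(current, end="")
    for table in [three, two, one]:  # dicts in order: longest keys first
        for key, value in table.items():  # key/value pairs in insertion order
            previous = current
            # re.escape keeps the key literal even if it contains a regex metacharacter
            current = re.sub(re.escape(key), value, current)
            if previous != current:  # show every step that changed the word
                print(" ->", current, end="")
    print()
    result.append(current)

print("", *result, sep="\n")  # empty line, then one converted word per line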

Output:

"""
quatrevingtdizaine -> quatrevingtdiz(ен)e -> quatrev(ин)gtdiz(ен)e
-> кatrev(ин)gtdiz(ен)e -> каtrev(ин)gtdiz(ен)e -> каtrev(ин)gtдiz(ен)e 
-> каtrv(ин)gtдiz(ен) -> каtrv(ин)гtдiz(ен) -> каtrv(ин)гtдиz(ен) 
-> каtрv(ин)гtдиz(ен) -> катрv(ин)гтдиz(ен) -> катрв(ин)гтдиz(ен) 
-> катрв(ин)гтдиз(ен)
l'info -> l'(ин)fo -> l'(ин)фo -> л'(ин)фo -> л'(ин)фо
au -> о
plus -> pлus -> плus -> плuс -> пл(ю)с
près -> prэs -> пrэs -> прэs -> прэс
de -> дe -> д
chez -> шez -> шz -> шз
vous -> vуs -> vус -> вус

катрв(ин)гтдиз(ен)
л'(ин)фо
о
пл(ю)с
прэс
д
шз
вус
 
"""
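One detail visible in the output: `de` ends up as `д`. The `"e "` key (e followed by a space) can never match in this setup, because `split()` has already removed the spaces and the plain `"e"` rule, which comes earlier in the dict, deletes every `e` first. If that key was meant to cover a word-final `e` (an assumption; the question does not say so), one possible sketch is to handle it with an end-of-string anchor before the other rules run. The function name `convert_final_e` is made up for this example.

```python
import re

CYRILLIC_E = "\u0435"  # Cyrillic small letter ie, looks like Latin "e"


def convert_final_e(word):
    """Turn a word-final Latin 'e' into Cyrillic 'е' (assumed intent of the "e " key)."""
    return re.sub(r"e$", CYRILLIC_E, word)


for word in ["de", "chez"]:
    converted = convert_final_e(word)
    # Report whether the last character is now the Cyrillic letter.
    print(word, "->", converted, converted.endswith(CYRILLIC_E))
```

The result would then go through the three tables as before; since the Cyrillic letter is not a key in any of them, it is left alone by the later rules.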


 