
Python: replace characters in a string from left to right

OK, so I found a similar question to this, but it focused on splitting the string into pairs of two characters.

The thing is, though, that I want to be able to factor in multiple possible replacement strings that are 2 characters long and 4 characters long. So, without splitting the string and while keeping it intact, I would like to scan the string from left to right and, upon finding any match, replace it and then continue scanning, prioritising the longer replacement sets first: "0000" becomes "e" and not "aa" or "00 00".

The usual .replace() function re-scans the string for each different value; I want to avoid this.

This is my script:

s = "0000000110110110100111111111"

x = s.replace("00","a").replace("11","b").replace("01","c").replace("10","d").replace("0000","e").replace("1111","f").replace("0101","g").replace("1010","h")

print(x)

My script so far produces: aaa0b0b0bcabbbb1

But I would like to get the result: eacdbchcff

based on the replacement possibilities of: 0000 00 01 10 11 01 1010 01 1111 1111
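To see why the chained calls cannot reach that result, it may help to print the string after each step. The sketch below uses the same input and the same replacement order as the script above; the names `pairs`, `steps` and `current` are only illustrative.

```python
s = "0000000110110110100111111111"

# Same order as the chained .replace() calls in the script.
pairs = [("00", "a"), ("11", "b"), ("01", "c"), ("10", "d"),
         ("0000", "e"), ("1111", "f"), ("0101", "g"), ("1010", "h")]

steps = []
current = s
for old, new in pairs:
    # Each call scans the whole string again, including characters
    # that earlier calls already produced.
    current = current.replace(old, new)
    steps.append(current)
    print(f"{old!r} -> {new!r}: {current}")
```

By the time the 4-character patterns are tried, every "00" and "11" has already been consumed by the 2-character replacements, so those patterns never get a chance to match.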

You could put the translations into a dict and combine the search patterns into a single regular expression that gives priority to the longer patterns. Then use the callback argument that re.sub accepts to make the replacement using the dict.

import re

trans = {
    "00": "a",
    "11": "b",
    "01": "c",
    "10": "d",
    "0000": "e",
    "1111": "f",
    "0101": "g",
    "1010": "h"
}

regex = "|".join(sorted(trans.keys(), key=len, reverse=True))

# demo
s =  "0000000110110110100111111111"
result = re.sub(regex, lambda x: trans[x.group(0)], s)
print(result)  # eacdbchcff
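If the keys could ever contain regex metacharacters (they do not in this question), joining them directly would change what the pattern matches. A minimal sketch of a more defensive variant, assuming a hypothetical helper named `make_replacer`, escapes each key with `re.escape` and compiles the pattern once:

```python
import re

def make_replacer(table):
    """Return a function that applies `table` in one left-to-right pass."""
    # Longest keys first, so "0000" is tried before "00".
    keys = sorted(table, key=len, reverse=True)
    # re.escape keeps characters such as "." or "|" literal.
    pattern = re.compile("|".join(re.escape(k) for k in keys))

    def replace(text):
        return pattern.sub(lambda m: table[m.group(0)], text)

    return replace

trans = {"00": "a", "11": "b", "01": "c", "10": "d",
         "0000": "e", "1111": "f", "0101": "g", "1010": "h"}
replace_bits = make_replacer(trans)
print(replace_bits("0000000110110110100111111111"))
```

Characters that match none of the keys are left in place, since re.sub only rewrites the matched spans.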

A non-regex approach would be to take the string in fixed groups of 4 characters and look each group up; if there is no match for the whole group, split it into two halves and look those up instead.

replacements = {'0000': 'e', '1111': 'f', '1010': 'h', '0101': 'g', '10': 'd', '01': 'c', '11': 'b', '00': 'a'}
s = "0000000110110110100111111111"
r_d = replacements  # short alias, only here to shorten the comprehension

for i in range(0, len(s), 4):
    # Try the 4-character group first, then fall back to its two halves.
    print(r_d.get(s[i:i+4], r_d.get(s[i:i+2], "") + r_d.get(s[i+2:i+4], "")), end="")

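Or, with the loop written as a generator expression:

"".join(r_d.get(s[i:i+4], r_d.get(s[i:i+2], "") + r_d.get(s[i+2:i+4], "")) for i in range(0, len(s), 4))
'eacdbcddcff'

Note that this gives eacdbcddcff rather than the eacdbchcff asked for: the groups are aligned to 4-character boundaries, so the 1010 that starts at offset 14 straddles two groups and is never seen as a unit.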
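A non-regex scan that is not tied to 4-character boundaries could look like the sketch below. It is only an illustration: the function name `replace_left_to_right` is made up here, and unmatched characters are copied through unchanged, which is an assumption about the desired behaviour.

```python
def replace_left_to_right(text, table):
    """Scan `text` once, preferring the longest key at each position."""
    # Longest keys first, so "0000" wins over "00".
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)  # continue scanning after the match
                break
        else:
            # No key matches at this position: keep the character as is.
            out.append(text[i])
            i += 1
    return "".join(out)

replacements = {'0000': 'e', '1111': 'f', '1010': 'h', '0101': 'g',
                '10': 'd', '01': 'c', '11': 'b', '00': 'a'}
s = "0000000110110110100111111111"
print(replace_left_to_right(s, replacements))
```

For a handful of short keys, trying every key at each position is cheap; with many keys, the single compiled regular expression is likely the simpler choice.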

