How to replace numbers inside of a string using Regular Expression

I am quite new to regular expressions, so I am confused about how to replace numbers inside a string.

a="12ab34cde56" 

I want to replace it with 12abXXcde56

b="abc1235ef"

I want to replace it with abcXXXXef

c="1ab12cd"

I want to replace it with 1abXXcd

I have been trying this in Python and in PHP, but with no luck. This is what I had in mind:

^([0-9]+)([a-z]+)(.*)([a-z]+)([0-9]+)$

You can use this regex to capture every run of digits that is neither leading nor trailing:

(?<!^|\d)\d+(?!$|\d)

Then, in Python, you can supply a function that replaces each match with the corresponding number of X characters.

For PHP, you can enable PREG_OFFSET_CAPTURE to learn the position of each match, then loop through the list of matches and process them.

Note that with the regex above, " 5 ddds" will be changed into " X ddds".
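A minimal Python sketch of the replacement-function idea described above. The stock re module only accepts fixed-width lookbehind, so the sketch assumes the lookbehind (?<!^|\d) is split into two separate assertions; the names INNER_DIGITS and mask_inner_digits are made up for illustration.

```python
import re

# Python's re rejects (?<!^|\d) because the two alternatives differ in width,
# so the lookbehind is written as two consecutive assertions instead.
INNER_DIGITS = re.compile(r'(?<!^)(?<!\d)\d+(?!$|\d)')

def mask_inner_digits(s):
    """Replace each non-leading, non-trailing digit run with as many X's."""
    return INNER_DIGITS.sub(lambda m: 'X' * len(m.group(0)), s)

for sample in ['12ab34cde56', 'abc1235ef', '1ab12cd', ' 5 ddds']:
    print(repr(sample), '->', repr(mask_inner_digits(sample)))
```

The last sample shows the caveat from the note: a digit run only has to avoid the very start and end of the string, so digits next to whitespace are masked too.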

We replace each run of digits ( \d+ ) in the string s that is surrounded by non-digits ( \D ) with the same number of X characters.

re.sub(r'(?<=\D)\d+(?=\D)',lambda match : 'X' * len(match.group(0)) , s)
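For reference, here is the same call wrapped in a small function and run against the inputs from the question. This is only a sketch; the function name mask_between_non_digits is an assumption, not part of the original answer.

```python
import re

def mask_between_non_digits(s):
    # (?<=\D) and (?=\D) each require an actual non-digit character next to
    # the run, so digit runs touching the start or end of s are left alone.
    return re.sub(r'(?<=\D)\d+(?=\D)', lambda m: 'X' * len(m.group(0)), s)

for sample in ['12ab34cde56', 'abc1235ef', '1ab12cd', '4f6g6h7']:
    print(sample, '->', mask_between_non_digits(sample))
```

Because lookarounds do not consume characters, a single letter between two digit runs can serve as the right-hand boundary of one match and the left-hand boundary of the next.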

The following pattern captures the string to replace in group 1:

^.*[a-z]+(\d+)[a-z]+.*$
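One way to use that capture group in Python is to take the span of group 1 and splice the X characters in by hand. This is a rough sketch under the assumption that only one digit run needs masking; mask_group is a hypothetical helper name.

```python
import re

PATTERN = re.compile(r'^.*[a-z]+(\d+)[a-z]+.*$')

def mask_group(s):
    m = PATTERN.match(s)
    if m is None:
        return s  # no digit run between lowercase letters; leave s unchanged
    start, end = m.span(1)
    # Rebuild the string around group 1, keeping the length the same.
    return s[:start] + 'X' * (end - start) + s[end:]

for sample in ['12ab34cde56', 'abc1235ef', '1ab12cd']:
    print(sample, '->', mask_group(sample))
```

Since the leading .* is greedy, the pattern captures only the last qualifying run, so a string with several inner runs (for example "4f6g6h7") would have just one of them masked.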

Demo.

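import re

# re1 splits the string into a head, a middle and a tail; re2 matches one digit.
re1 = re.compile(r"([\d]*[a-zA-Z])([\d\w]+)([a-zA-Z][\d]*)")
re2 = re.compile(r"([\d])")

s = "4f6g6h7"

def x(matchobj):
    # Keep the head and tail unchanged and mask every digit in the middle.
    return ''.join([matchobj.group(1),
        re2.sub('X', matchobj.group(2)), matchobj.group(3)])

print(re1.sub(x, s))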

Update: The original method won't work for the case "4f6g6h7", or for any string that has only one alphabetic character between digits.

If using two regular expressions instead of one is acceptable, the following code should work for you.

import re

# re1 splits the string into a head, a middle and a tail; re2 matches one digit.
re1 = re.compile(r"([\d]*[a-zA-Z])([\d\w]+)([a-zA-Z][\d]*)")
re2 = re.compile(r"([\d])")

s = ['12ab34cde56', "abc1235ef", "1ab12cd", "4f6g6h7"]

def x(matchobj):
    # Keep the head and tail unchanged and mask every digit in the middle.
    return ''.join([matchobj.group(1),
        re2.sub('X', matchobj.group(2)), matchobj.group(3)])

for ss in s:
    print(ss, '->', re1.sub(x, ss))

>>>
12ab34cde56 -> 12abXXcde56
abc1235ef -> abcXXXXef
1ab12cd -> 1abXXcd
4f6g6h7 -> 4fXgXh7
>>> 

The only possibility with the stock re module appears to be a replacement function, for example:

import re

xs = ["12ab34cde56", "abc1235ef", "1ab12cd"]

for x in xs:
    # Keep the non-digit on each side and mask the digits between them.
    print(x, re.sub(r'(\D)(\d+)(\D)', lambda m: m.group(1) + 'X' * len(m.group(2)) + m.group(3), x))
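One thing to watch with the pattern above: both neighbouring non-digits are consumed by the match, so a digit run separated from the previous one by a single character is skipped. A possible variation, sketched here with made-up function names, turns the trailing (\D) into a lookahead so that character stays available for the next match.

```python
import re

def mask_consuming(s):
    # Same idea as the snippet above: both neighbours are part of the match.
    return re.sub(r'(\D)(\d+)(\D)',
                  lambda m: m.group(1) + 'X' * len(m.group(2)) + m.group(3), s)

def mask_with_lookahead(s):
    # The following non-digit is only peeked at with (?=\D), so it can act as
    # the leading (\D) of the next match.
    return re.sub(r'(\D)(\d+)(?=\D)',
                  lambda m: m.group(1) + 'X' * len(m.group(2)), s)

print(mask_consuming('4f6g6h7'))       # second inner digit is missed
print(mask_with_lookahead('4f6g6h7'))  # both inner digits are masked
```

For the three inputs in the question the two functions give the same result; they differ only when inner digit runs sit one character apart.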

With the more advanced regex module you can use variable-width lookaround assertions:

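import regex  # third-party module, installed separately from the standard library

# xs is the same list of sample strings used in the previous snippet.
for x in xs:
    print(x, regex.sub(r'(?<=\D\d*)\d(?=\d*\D)', 'X', x))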
