Python regex: match the beginning of a string with a wildcard and replace the whole string

Question

I'm trying to match the beginning of a word and then replace the entire word with something. Below is what I'm trying to do.

add23khh234 > REMOVED
add2asdf675 > REMOVED

Below is the regex statement I'm using.

string_reg = re.sub(ur'add*', 'REMOVED', string_reg)

But this code gives me the following.

add23khh234 > REMOVED23khh234
add2asdf675 > REMOVED2asdf675

Answer 1

add* means ad followed by zero or more d characters, because the * applies only to the d immediately before it. From the documentation:

'*'

Causes the resulting RE to match 0 or more repetitions of the preceding RE, as many repetitions as are possible. ab* will match 'a', 'ab', or 'a' followed by any number of 'b's.

So it matches ad, add, adddddd, and so on. But it matches neither the whole of add23khh234 nor the whole of add2asdf675 (or anything like them): only the leading add is matched, which is why the rest of the word is left in place.
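To make that concrete, here is a small sketch (written for Python 3, so it uses r'...' rather than the Python 2 ur'...' prefix) that prints the part of each word add* actually matches. The word list is only an illustration built from the examples above.

```python
import re

words = ["ad", "add", "adddddd", "add23khh234", "add2asdf675"]

# For each word, record the part that add* actually matches at the start.
matched = {w: re.match(r"add*", w).group(0) for w in words}

for word, part in matched.items():
    print(f"{word!r}: add* matches {part!r}")

# Only the matched part is replaced, so the rest of the word survives.
result = re.sub(r"add*", "REMOVED", "add23khh234")
print(result)
```

For the two sample words the match stops at add, because the next character is not a d.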

You should use .+? or .*? here (not .* , which is greedy). Try something like this:

string_reg = re.sub(ur'add.+? ', 'REMOVED ', string_reg)

Demo:

>>> import re
>>> string_reg = """\
... add23khh234 > REMOVED23khh234
... add2asdf675 > REMOVED2asdf675"""

>>> string_reg = re.sub(ur'add.+? ', 'REMOVED ', string_reg)
>>> print string_reg
REMOVED > REMOVED23khh234
REMOVED > REMOVED2asdf675
>>>
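One thing worth checking with this approach is how much it relies on the trailing space in the pattern. The sketch below (Python 3 syntax; the sample text is made up for illustration) compares the lazy quantifier with and without the space, and the greedy version.

```python
import re

text = "add23khh234 > add2asdf675"

# Lazy .+? with nothing after it stops after a single character.
lazy_alone = re.sub(r"add.+?", "REMOVED", text)

# With a trailing space in the pattern, .+? must expand up to the first space.
lazy_to_space = re.sub(r"add.+? ", "REMOVED ", text)

# Greedy .+ followed by a space runs to the last space in the line.
greedy_to_space = re.sub(r"add.+ ", "REMOVED ", text)

print(lazy_alone)       # REMOVED3khh234 > REMOVEDasdf675
print(lazy_to_space)    # REMOVED > add2asdf675
print(greedy_to_space)  # REMOVED add2asdf675
```

Note that a word at the very end of the text, with no space after it, is not replaced by the pattern that ends in a space.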

Answer 2

Try this:

string_reg = re.sub(ur'^add.*', 'REMOVED', string_reg)
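As a rough illustration of what this pattern does (Python 3 syntax, with a made-up two-line input): ^ anchors at the start of the string and .* consumes the rest of that line, so everything after the word on the same line is replaced too. If the input has several lines, the re.MULTILINE flag is needed for ^ to match at each line start.

```python
import re

text = "add23khh234 trailing words\nadd2asdf675"

# Without flags, ^ matches only at the start of the whole string and
# . does not cross a newline, so the entire first line is replaced.
first_line_only = re.sub(r"^add.*", "REMOVED", text)

# re.MULTILINE lets ^ match at the start of every line.
every_line = re.sub(r"^add.*", "REMOVED", text, flags=re.MULTILINE)

print(repr(first_line_only))  # 'REMOVED\nadd2asdf675'
print(repr(every_line))       # 'REMOVED\nREMOVED'
```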

Answer 3

If you have multiple patterns on one line:

string_reg=re.sub("add[^ ]+","REMOVED",string_reg)
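A quick sketch of how this behaves (Python 3 syntax; the sample line is invented for illustration): [^ ]+ consumes everything up to the next space, so several words on one line are each replaced. Because the pattern has no anchor or word boundary, it also matches add in the middle of a word.

```python
import re

text = "add23khh234 keep add2asdf675 xxxadd2axxx"

# Each run of "add" plus at least one non-space character is replaced.
result = re.sub(r"add[^ ]+", "REMOVED", text)

print(result)  # REMOVED keep REMOVED xxxREMOVED
```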

Answer 4

Short answer

\badd\w*

A quantifier such as * is applied to the previous token or subpattern. For example, the regex you're using, add* , matches a literal ad followed by any number of subsequent d characters.

Meeting your criteria

You need to match add at the beginning of a word, so use a word boundary: \b
Then you also need to match the rest of the word in order to replace it. \w is shorthand for [a-zA-Z0-9_] (in Python 3, \w on str patterns also matches Unicode letters and digits unless re.ASCII is set). It matches one word character, and that is what you need to repeat any number of times with * .

Code

import re

string_reg = 'add23khh234 ... add2asdf675 ... xxxadd2axxx'

string_reg = re.sub(ur'\badd\w*', 'REMOVED', string_reg)
print(string_reg)

Output

REMOVED ... REMOVED ... xxxadd2axxx

ideone demo
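The ur'...' prefix used in this thread is Python 2 only and is a syntax error in Python 3. As a minimal sketch of the same idea for Python 3, the pattern can be precompiled and wrapped in a small helper. The names ADD_WORD and remove_add_words are hypothetical, chosen here for illustration; the pattern is the one from the answer.

```python
import re

# Hypothetical names for illustration; the pattern is the one from the answer.
ADD_WORD = re.compile(r"\badd\w*")

def remove_add_words(text, replacement="REMOVED"):
    """Replace every word that starts with 'add' by `replacement`."""
    return ADD_WORD.sub(replacement, text)

print(remove_add_words("add23khh234 ... add2asdf675 ... xxxadd2axxx"))
# REMOVED ... REMOVED ... xxxadd2axxx

# \w does not match '-', so only the part before the hyphen is replaced.
print(remove_add_words("add-on"))
# REMOVED-on
```

Since \w* can match zero characters, the bare word add is replaced as well.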

4 answers

Answer 1
1 2015-11-10 04:45:32

Answer 2
0 2015-11-10 04:41:25

Answer 3
0 2015-11-10 04:46:30

Answer 4
0 2015-11-10 05:12:00
