Regular expression for substitution of similar pattern in a string in Python
I want to use a regular expression to detect and substitute some phrases. These phrases follow the same pattern but deviate at some points. All the phrases are in the same string.
For instance, I have this string:
/this/is//an example of what I want /to///do
I want to match all the words attached to the slashes, including the slashes themselves, and replace them with "" (the empty string).
To solve this, I used the following code:
import re
txt = "/this/is//an example of what i want /to///do"
re.search("/.*/",txt1, re.VERBOSE)
pattern1 = r"/.*?/\w+"
a = re.sub(pattern1,"",txt)
The result is:
' example of what i want '
which is what I want, that is, the phrases attached to the slashes are replaced with "". But when I run the same pattern on the following sentence
"/this/is//an example of what i want to /do"
I get
' example of what i want to /do'
How can I use one regex to remove all the phrases together with their slashes, irrespective of the number of / in a phrase?
In your example code, you can omit this part:
re.search("/.*/",txt1, re.VERBOSE)
It executes the command, but you are not doing anything with the result. Note also that txt1 is never defined in the snippet (the variable is named txt), so as written the call would raise a NameError.
You can match 1 or more / followed by word chars:
/+\w+
Or, for a slightly broader match, match one or more / followed by any chars other than / or whitespace:
/+[^\s/]+
/+ - Match 1+ occurrences of /
[^\s/]+ - Match 1+ occurrences of any char except a whitespace char or /
import re

strings = [
    "/this/is//an example of what I want /to///do",
    "/this/is//an example of what i want to /do"
]

for txt in strings:
    # Remove every run of slashes together with the token that follows it
    pattern1 = r"/+[^\s/]+"
    a = re.sub(pattern1, "", txt)
    print(a)
Output
example of what I want
example of what i want to
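The difference between the two patterns above only shows up when a token contains non-word characters. Here is a minimal sketch of that, using a made-up input string (the sample text is an assumption for illustration, not taken from the question):

```python
import re

# Hypothetical input, not from the question: tokens containing "-" and "."
sample = "keep /foo-bar and /a.b here"

# \w+ stops at the first non-word char, so "-bar" and ".b" are left behind
narrow = re.sub(r"/+\w+", "", sample)

# [^\s/]+ runs up to the next whitespace or slash, so the whole token goes
broad = re.sub(r"/+[^\s/]+", "", sample)

print(repr(narrow))  # 'keep -bar and .b here'
print(repr(broad))   # 'keep  and  here'
```

Both variants leave the surrounding spaces in place, so you may still want to normalize whitespace afterwards if that matters for your data.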
You can use
/(?:[^/\s]*/)*\w+
See the regex demo. Details:
/ - a slash
(?:[^/\s]*/)* - zero or more repetitions of: zero or more chars other than a slash and whitespace, followed by a slash
\w+ - one or more word chars.
See the Python demo:
import re

rx = re.compile(r"/(?:[^/\s]*/)*\w+")
texts = ["/this/is//an example of what I want /to///do", "/this/is//an example of what i want to /do"]

for text in texts:
    # strip() drops the leading/trailing spaces left behind by the removal
    print(rx.sub('', text).strip())

# => example of what I want
#    example of what i want to
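To see why the pattern from the question leaves /do behind, it can help to list what each pattern actually matches instead of substituting right away. This is only a diagnostic sketch using re.findall on the second sentence from the question; the variable names are made up for illustration:

```python
import re

text = "/this/is//an example of what i want to /do"

# Pattern from the question: needs a second slash after the first one,
# so a lone "/do" at the end can never match
lazy = re.findall(r"/.*?/\w+", text)

# Pattern from this answer: the inner slashes are optional
grouped = re.findall(r"/(?:[^/\s]*/)*\w+", text)

print(lazy)     # ['/this/is', '//an']
print(grouped)  # ['/this/is//an', '/do']
```

Since the group in the second pattern is non-capturing, re.findall returns the full matches rather than group contents.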