Get a string between two variable strings that contain regex-specific characters using Python
So I have a string such as this:
r'irrelevant data (~symbol)relevant data(/~symbol) irrelevant data'
and I want to extract the relevant data. However, the (~symbol) tag is variable, meaning that to build the relevant regex we would need to do something like
import re
tags = ["(~symbol)","(/~symbol)"]
string = r'irrelevant data (~symbol)relevant data(/~symbol) irrelevant data'
regex = r'{}([^"]*){}'.format(tags[0],tags[1])
result = re.findall(regex , string)[0]
The problem is that our tags contain characters that would need to be escaped if used in a regular expression, so in this case the result would contain the tags themselves instead of just the desired string.
Is there a good solution that doesn't involve replace?
There's a lot in your question, so I'll try addressing the points one by one:
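To split a string on a pattern, use re.split.
To escape regex-special characters in the tags, use re.escape.
To keep the tags themselves out of the result, wrap the alternatives in a non-capturing group, (?:...).
For your example, it would be something like this: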
import re
patterns = ["(~symbol)", "(/~symbol)"]
string = r'irrelevant data (~symbol)relevant data(/~symbol) irrelevant data'
result = re.split('(?:' + '|'.join(map(re.escape, patterns)) + ')', string)
which then gives
['irrelevant data ', 'relevant data', ' irrelevant data']
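If you only want the text between the tags rather than every piece of the split, the same re.escape idea can be applied to the pattern from the question. The following is a minimal sketch, not the only way to do it: the helper name text_between is hypothetical, and the non-greedy (.*?) group with re.DOTALL is an assumption about how the inner text should be matched (shortest match, newlines allowed).

```python
import re

def text_between(text, start_tag, end_tag):
    """Return every substring found between start_tag and end_tag.

    text_between is a hypothetical helper name used for illustration.
    """
    # re.escape makes the tags match literally, so "(" and ")" in the tags
    # are not interpreted as regex groups.
    pattern = re.escape(start_tag) + r'(.*?)' + re.escape(end_tag)
    # With exactly one capturing group, findall returns only the inner text.
    return re.findall(pattern, text, flags=re.DOTALL)

tags = ["(~symbol)", "(/~symbol)"]
string = r'irrelevant data (~symbol)relevant data(/~symbol) irrelevant data'
matches = text_between(string, tags[0], tags[1])
print(matches)  # ['relevant data']
```

Because the group is non-greedy, a string with several tag pairs yields one entry per pair instead of one long match spanning all of them.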