
Get a string between two variable strings that contain regex-specific characters using Python

I have a string such as this:

r'irrelevant data (~symbol)relevant data(/~symbol) irrelevant data'

and I want to get the relevant data. However, the (~symbol) tag is variable, so to build the regex we would need to do something like:

import re
tags = ["(~symbol)", "(/~symbol)"]
string = r'irrelevant data (~symbol)relevant data(/~symbol) irrelevant data'
regex = r'{}([^"]*){}'.format(tags[0], tags[1])
result = re.findall(regex, string)[0]

The problem is that our tags contain characters that would need to be escaped if used in a regular expression, so in this case the result would contain the tags themselves instead of just the desired string.

Is there a good solution that doesn't involve replace?

There are several parts to your question, so I'll try to address them one by one:

  • To get the pieces of data around and between the tags, you might want to look into re.split.
  • For separators with special characters, use re.escape.
  • To exclude the separators from the result, use a non-capturing group, (?:...).
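As a small illustration of the last point, here is a sketch of how re.split treats a capturing group versus a non-capturing one. The sample string "a-b-c" and the variable names are made up for the example and are not from the question.

```python
import re

text = "a-b-c"  # hypothetical sample input

# With a capturing group, re.split keeps the separators in the output.
with_capture = re.split(r"(-)", text)

# With a non-capturing group, the separators are left out.
without_capture = re.split(r"(?:-)", text)

print(with_capture)     # ['a', '-', 'b', '-', 'c']
print(without_capture)  # ['a', 'b', 'c']
```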

For your example, it would be something like this:

import re
patterns = ["(~symbol)", "(/~symbol)"]
string = r'irrelevant data (~symbol)relevant data(/~symbol) irrelevant data'
result = re.split('(?:' + '|'.join(map(re.escape, patterns)) + ')', string)

which then gives:

['irrelevant data ', 'relevant data', ' irrelevant data']
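If only the text between the two tags is wanted, one possible variation is to keep the format-based pattern from the question and pass each tag through re.escape. This is just a sketch: the non-greedy `(.*?)` group and the `match` variable are my own choices, and it assumes the opening and closing tags appear on the same line (otherwise re.DOTALL would be needed).

```python
import re

tags = ["(~symbol)", "(/~symbol)"]
string = r'irrelevant data (~symbol)relevant data(/~symbol) irrelevant data'

# Escape each tag so its parentheses are matched literally,
# and capture only the text between the two tags (non-greedy).
regex = "{}(.*?){}".format(re.escape(tags[0]), re.escape(tags[1]))

match = re.search(regex, string)
result = match.group(1) if match else None

print(result)  # relevant data
```

Using re.search with a single capturing group returns just the inner text, so nothing has to be picked out of the list that re.split produces.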
