Searching for an exact string containing parentheses using regex

Question

I am new to regexes.

I have the following string: \n(941)\n364\nShackle\n(941)\nRivet\n105\nTop

Out of this string, I want to extract Rivet, and I already have (941) as a string in a variable.

My thought process was like this:

1. Find all the (941)s.
2. Filter the results by checking whether the (941) is followed by \n, then a word, and then \n.

I made a regex for the 2nd part: \n[\w\s\'\d\-\/\.]+$\n

The problem I am facing is that, because of the parentheses in (941), the regex treats 941 as a group. The regex in the 3rd step may be wrong, which I can fix later, but first I need help finding the 2nd (941) so that I can then apply the 3rd step to it.
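To make the problem concrete, here is a minimal sketch of what happens when (941) is used directly as a pattern, compared with a pattern whose parentheses are escaped by hand. The variable names groups, whole, and literal are only illustrative.

```python
import re

s = '\n(941)\n364\nShackle\n(941)\nRivet\n105\nTop'

# Unescaped, the parentheses form a capturing group around the digits 941.
groups = re.findall('(941)', s)
print(groups)  # ['941', '941']

# The overall match is just the digits; the parentheses are not part of it.
whole = re.search('(941)', s).group()
print(whole)  # 941

# With the parentheses escaped, they are matched as literal characters.
literal = re.search(r'\(941\)', s).group()
print(literal)  # (941)
```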

PS.

I know I can use Python string methods like find and then loop over the results, but I wanted to see if this can be done directly using regex only.
I have tried the following regexes: (?:...), (941){1}, and escaping with the regex literal character \, like this: \(941\), with no useful results. Maybe I am using them wrong.

I just wanted to know if this can be done using regex. I thought it might also be useful for others, or a good share for future viewers.

Thanks

Answer 1

Assuming:

You want to avoid matching substrings that consist only of digits;
You want to match a substring made of word characters (thus possibly including digits);

Try escaping the variable and using it in the regular expression through an f-string:

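import re

s = '\n(941)\n364\nShackle\n(941)\nRivet\n105\nTop'
var1 = '(941)'
var2 = re.escape(var1)  # escape the parentheses so they match literally
# (?!\d+\n) skips a following line that is all digits; (\w+) captures the word
m = re.findall(fr'{var2}\n(?!\d+\n)(\w+)', s)[0]
print(m)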

Prints:

Rivet
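
To see what the negative lookahead (?!\d+\n) in the pattern above contributes, here is a small sketch that runs the pattern with and without it on the string from the question. The names label, without_guard, with_guard, and first are illustrative. re.search is swapped in at the end only to show one possible way of avoiding an IndexError when nothing matches; it is not part of the answer above.

```python
import re

s = '\n(941)\n364\nShackle\n(941)\nRivet\n105\nTop'
label = re.escape('(941)')  # escaped so the parentheses are literal

# Without the lookahead, \w+ also accepts the all-digit line 364.
without_guard = re.findall(fr'{label}\n(\w+)', s)
print(without_guard)  # ['364', 'Rivet']

# (?!\d+\n) rejects a following line that consists only of digits.
with_guard = re.findall(fr'{label}\n(?!\d+\n)(\w+)', s)
print(with_guard)  # ['Rivet']

# re.search returns None instead of raising when there is no match.
m = re.search(fr'{label}\n(?!\d+\n)(\w+)', s)
first = m.group(1) if m else None
print(first)  # Rivet
```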

Answer 2

If you have text in a variable that should be matched exactly, use re.escape() to escape it when substituting it into the regexp.

import re

s = '\n(941)\n364\nShackle\n(941)\nRivet\n105\nTop'
num = '(941)'
print(re.findall(rf'(?<=\n{re.escape(num)}\n)[\w\s\'\d\-\/\.]+(?=\n)', s))
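
One thing worth checking when running this: the character class is carried over from the question and contains \s, which also matches \n, so a match can extend over more than one line. The sketch below shows what the call returns on the sample string, followed by an assumed variant (not part of this answer) that allows a plain space instead of \s and reuses the digits-only guard from Answer 1. The names wide and narrow are illustrative.

```python
import re

s = '\n(941)\n364\nShackle\n(941)\nRivet\n105\nTop'
num = re.escape('(941)')

# Same class as in the question, with redundant escapes dropped.
# Because \s matches \n, each match runs on into the next line.
wide = re.findall(rf"(?<=\n{num}\n)[\w\s'\-/.]+(?=\n)", s)
print(wide)  # ['364\nShackle', 'Rivet\n105']

# Assumed variant: a literal space instead of \s keeps each match on one line,
# and (?!\d+\n) skips a line that is all digits.
narrow = re.findall(rf"(?<=\n{num}\n)(?!\d+\n)[\w '\-/.]+(?=\n)", s)
print(narrow)  # ['Rivet']
```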

This puts (941)\n in a lookbehind, so it's not included in the match. Because the trailing \n is only checked by a lookahead and not consumed, this also avoids the problem of the \n at the end of one match overlapping with the \n at the beginning of the next.
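
The overlap problem can be made visible with a hypothetical input in which two (941) labels are separated by a single value line. The string s2 and the simplified character class [\w ]+ below are assumptions for illustration only.

```python
import re

# Hypothetical input: the \n after Rivet is also the \n before the second (941).
s2 = '\n(941)\nRivet\n(941)\nBolt\n'
num = re.escape('(941)')

# Consuming both newlines: the first match uses up the shared \n,
# so the second label can no longer be matched.
consuming = re.findall(rf'\n{num}\n([\w ]+)\n', s2)
print(consuming)  # ['Rivet']

# Lookarounds check the newlines without consuming them.
lookaround = re.findall(rf'(?<=\n{num}\n)[\w ]+(?=\n)', s2)
print(lookaround)  # ['Rivet', 'Bolt']
```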

2 answers

Answer 1
1 Accepted 2022-07-28 15:14:09

Answer 2
1 2022-07-28 15:26:56
