Regex to return all characters between two special characters

Question

How would I go about using regex to return all characters between two brackets? Here is an example:

foobar['infoNeededHere']ddd
needs to return infoNeededHere

I found a regex to do it between curly brackets, but all attempts at making it work with square brackets have failed. Here is that regex: (?<={)[^}]*(?=}) and here is my attempt to hack it:

(?<=[)[^}]*(?=])

Final Solution:

import re

# Named s rather than str so the built-in str type is not shadowed.
s = "foobar['InfoNeeded'],"
match = re.match(r"^.*\['(.*)'\].*$", s)
print(match.group(1))  # InfoNeeded
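
The final solution calls .group(1) directly, which fails when the input has no bracketed part, because re.match then returns None. A small sketch of a guarded version follows; the helper name extract_bracketed is hypothetical and not part of the original post.

```python
import re

def extract_bracketed(text):
    """Return the text between [' and '] in text, or None if it is absent."""
    match = re.match(r"^.*\['(.*)'\].*$", text)
    # re.match returns None when the pattern does not match,
    # so check before calling .group().
    return match.group(1) if match else None

print(extract_bracketed("foobar['InfoNeeded'],"))  # InfoNeeded
print(extract_bracketed("no brackets here"))       # None
```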

Answer 1

If you're new to REG(ular) EX(pressions), you can learn about them in the Python docs. Or, if you want a gentler introduction, you can check out the HOWTO. They use Perl-style syntax.

Regex

The expression that you need is .*?\[(.*)\].* . The group that you want will be \1 .
- .*? : . matches any character except a newline. * is a meta-character and means "repeat this 0 or more times". ? makes the * non-greedy, i.e., . will match as few characters as possible before hitting a '['.
- \[ : \ escapes special meta-characters, which in this case is [ . If we didn't do that, [ would start a character class instead of matching a literal bracket.
- (.*) : Parentheses 'group' whatever is inside them, and you can later retrieve the groups by their numeric IDs or names (if they're given one).
- \].* : You should know enough by now to know what this means: a literal ] followed by any remaining characters.
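
The difference between greedy and non-greedy quantifiers described above is easiest to see on a string with more than one pair of brackets. The following sketch uses a made-up input string; only the first pattern comes from this answer, and the second is a variant with a non-greedy group added for comparison.

```python
import re

s = "a[one]b[two]c"

# Greedy group: (.*) runs to the LAST ] in the string.
greedy = re.search(r".*?\[(.*)\].*", s).group(1)

# Non-greedy group: (.*?) stops at the FIRST ] it can.
lazy = re.search(r".*?\[(.*?)\].*", s).group(1)

print(greedy)  # one]b[two
print(lazy)    # one
```

With a single pair of brackets, as in the question, both forms return the same text.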

Implementation

First, import the re module (it's not a built-in) wherever you want to use the expression.

Then, use re.search(regex_pattern, string_to_be_tested) to search for the pattern in the string to be tested. This will return a match object (or None if nothing matches), which you can store in a temporary variable. You should then call its group() method and pass 1 as an argument (to see the 'Group 1' we captured using parentheses earlier). It should now look like:

>>> import re
>>> pat = r'.*?\[(.*)].*'             #See Note at the bottom of the answer
>>> s = "foobar['infoNeededHere']ddd"
>>> match = re.search(pat, s)
>>> match.group(1)
"'infoNeededHere'"

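The result above still carries the single quotes, while the question asks for infoNeededHere alone. One way to drop them, sketched here as an assumption rather than as part of the original answer, is to match the quotes outside the group. The example also shows a named group, which the explanation of (.*) mentions; the name info is purely illustrative.

```python
import re

s = "foobar['infoNeededHere']ddd"

# The quotes are matched but sit outside the group, so they are not captured.
# (?P<info>...) gives the group a name in addition to its number.
match = re.search(r"\['(?P<info>.*?)'\]", s)

info = match.group("info") if match else None
print(info)            # infoNeededHere
print(match.group(1))  # same group, retrieved by number
```
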
An Alternative

You can also use findall() to find all the non-overlapping matches by modifying the regex to (?<=\[).+?(?=\]) .
- (?<=\[) : (?<=) is called a look-behind assertion and checks for an expression preceding the actual match.
- .+? : + is just like * except that it matches one or more repetitions. It is made non-greedy by ? .
- (?=\]) : (?=) is a look-ahead assertion and checks for an expression following the match without capturing it.
Your code should now look like:

>>> import re
>>> pat = r'(?<=\[).+?(?=\])'  #See Note at the bottom of the answer
>>> s = "foobar['infoNeededHere']ddd[andHere] [andOverHereToo[]"
>>> re.findall(pat, s)
["'infoNeededHere'", 'andHere', 'andOverHereToo[']

Note: Always use raw Python strings for regex patterns by adding an 'r' before the string (e.g., r'blah blah blah'), so that backslashes reach the regex engine unchanged.
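
A short sketch of what the note guards against. The strings below are made up for illustration: \[ happens to survive either way if you double the backslash, but an escape like \b silently turns into a different character in a normal string.

```python
import re

# A raw string keeps the backslash; a normal string needs it doubled.
same = r"\[" == "\\["
print(same)  # True

# In a normal string "\b" is a backspace character, not a word boundary.
raw_hits = re.findall(r"\bcat\b", "cat concat")
plain_hits = re.findall("\bcat\b", "cat concat")

print(raw_hits)    # ['cat']
print(plain_hits)  # []
```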

Thanks for reading! I wrote this answer when there were no accepted ones yet, but by the time I finished it, 2 more came up and one got accepted. :(

Answer 2

^.*\['(.*)'\].*$ will match a line and capture what you want in a group.

You have to escape the [ and ] with \

The documentation at the rubular.com proof link will explain how the expression is formed.
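
If the delimiters are not known in advance, the escaping can be left to re.escape instead of being written by hand. This is a sketch only: the helper find_between and its parameters are hypothetical names, not something from the answers here.

```python
import re

def find_between(text, opener, closer):
    """Return every shortest run of characters between opener and closer."""
    # re.escape backslash-escapes regex meta-characters such as [ and ].
    pattern = re.escape(opener) + r"(.*?)" + re.escape(closer)
    return re.findall(pattern, text)

print(find_between("foobar['infoNeededHere']ddd", "['", "']"))  # ['infoNeededHere']
print(find_between("a{x}b{y}", "{", "}"))                       # ['x', 'y']
```

The non-greedy (.*?) keeps each match from running past the first closing delimiter.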

Answer 3

If there's only one of these [.....] tokens per line, then you don't need to use regular expressions at all:

In [7]: mystring = "Bacon, [eggs], and spam"

In [8]: mystring[ mystring.find("[")+1 : mystring.find("]") ]
Out[8]: 'eggs'
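
The slice above assumes both brackets are present; str.find returns -1 when they are not, which would silently produce a wrong slice. A guarded sketch follows, where between_brackets is a hypothetical helper name.

```python
def between_brackets(text):
    """Return the text between the first [ and the next ], or None."""
    start = text.find("[")
    if start == -1:
        return None
    # Search for ] only after the opening bracket.
    end = text.find("]", start + 1)
    if end == -1:
        return None
    return text[start + 1:end]

print(between_brackets("Bacon, [eggs], and spam"))  # eggs
print(between_brackets("no brackets"))              # None
```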

If there's more than one of these per line, then you'll need to modify Jarrod's regex ^.*\['(.*)'\].*$ to match multiple times per line, and to be non-greedy. (Use the .*? quantifier instead of the .* quantifier.)

In [15]: mystring = "[Bacon], [eggs], and [spam]."

In [16]: re.findall(r"\[(.*?)\]",mystring)
Out[16]: ['Bacon', 'eggs', 'spam']

3 answers

Answer 1: 31, 2012-03-27 14:41:12
Answer 2: 20, accepted
Answer 3: 10, 2012-03-27 12:56:38
