Regular expression to get text between brackets that itself contains bracketed text
After trying 10 times to rewrite this question so it would be accepted: I have a small text that contains text between brackets. I want to extract that text, so I wrote this expression:
/(\([^\)]+\))/i
but this only extracts the text between the first ( and the first ), ignoring the rest of the text. Is there any way to extract the full text, like:
i want(to) extract this text
from:
this is the text that (i want(to) extract this text) from
There might be more than one bracket-enclosed sub-text.
Thanks
EDIT: Found this:
preg_match_all("/\((([^()]*|(?R))*)\)/", $rejoin, $matches);
Very useful, from the link provided in the accepted answer.
Yes, you can use this pattern:
            v            v
(\([^\)\(]*)+([^\)\(]*\))+
------------ -------------
     |             |
     |             |->match all (right)brackets to the right..
     |
     |->match all (left)brackets to the left
The above pattern won't work if you have a recursive (nested) structure like this:
(i want(to) (extract and also (this)) this text)
                              ------
            -------------------------
In this case you can use the recursive pattern recommended by elclanrs.
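To make the limitation concrete, here is a small check of the non-recursive pattern against both inputs. It is only a sketch: Python is used for illustration (this part of the pattern syntax behaves the same way as in PCRE), and the helper name first_match is made up for this example.

```python
import re

# The non-recursive pattern from the answer above, unchanged.
PATTERN = re.compile(r"(\([^\)\(]*)+([^\)\(]*\))+")

def first_match(text):
    """Return the first substring matched by PATTERN, or None if there is none."""
    m = PATTERN.search(text)
    return m.group(0) if m else None

simple = "this is the text that (i want(to) extract this text) from"
nested = "(i want(to) (extract and also (this)) this text)"

print(first_match(simple))  # (i want(to) extract this text)
print(first_match(nested))  # (i want(to)
```

On the second input the match stops at the first ), because a new ( appears after brackets have already started closing, which the pattern cannot express.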
You can also do it without regex by maintaining a count of the number of ( and ).
So, assume noOfLB is the count of ( and noOfRB is the count of ).
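Scan the string character by character, starting at the first (. Increment noOfLB for every ( and noOfRB for every ). As soon as noOfLB equals noOfRB, the current character is the ) that closes the first (.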
I don't know PHP, so I would implement the above algorithm in C#:
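public static string getFirstRecursivePattern(string input)
{
    // Start at the first "(" (IndexOf returns -1 if there is none).
    int firstB = input.IndexOf("("), noOfLB = 0, noOfRB = 0;
    for (int i = firstB; i < input.Length && i >= 0; i++)
    {
        if (input[i] == '(') noOfLB++;
        if (input[i] == ')') noOfRB++;
        // Equal counts mean the first "(" has just been closed.
        if (noOfLB == noOfRB) return input.Substring(firstB, i - firstB + 1);
    }
    return ""; // no "(" found, or the brackets never balance
}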
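Since the question mentions that there might be more than one bracketed sub-text, the same counting idea can be extended to collect every top-level group. The following is a minimal sketch in Python, not a port of the C# method: the function name top_level_groups is hypothetical, and unlike the C# version it returns the contents without the enclosing brackets.

```python
def top_level_groups(text):
    """Return the contents of every balanced top-level (...) group in text."""
    groups = []
    depth = 0      # number of brackets currently open
    start = None   # index just past the "(" that opened the current group
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i + 1
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                groups.append(text[start:i])
    return groups

print(top_level_groups("this is the text that (i want(to) extract this text) from"))
# ['i want(to) extract this text']
print(top_level_groups("(a (b)) and (c)"))
# ['a (b)', 'c']
```

In this sketch a stray ) with no matching ( is ignored, and a group that is never closed is dropped; whether that is the right behavior depends on the input you expect.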
You will need recursive subpatterns to solve this. Here is the regex that should work for you:
$str = 'this is the text that (i want(to) extract this text) from';
if (preg_match('/\s* \( ( (?: [^()]* | (?0) )+ ) \) /x', $str, $arr))
    var_dump($arr[1]); // group 1 holds the text inside the outermost brackets
OUTPUT:
string(28) "i want(to) extract this text"
You can also use substrings:
$yourString = "this is the text that (i want(to) extract this text) from";
$stringAfterFirstParen = substr( strstr( $yourString, "(" ), 1 );
$indexOfLastParen = strrpos( $stringAfterFirstParen, ")" );
$stringBetweenParens = substr( $stringAfterFirstParen, 0, $indexOfLastParen );
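The substring approach takes everything between the first ( and the last ), which is worth seeing on an input with two separate groups. Below is a rough Python rendering of the same idea, offered as an illustration only; the function name between_first_and_last is an assumption, and edge cases (for example, no brackets at all) are handled here by returning an empty string, which may differ from the PHP snippet.

```python
def between_first_and_last(text):
    """Return the text between the first "(" and the last ")", or "" if absent."""
    first = text.find("(")
    last = text.rfind(")")
    if first == -1 or last < first:
        return ""
    return text[first + 1:last]

print(between_first_and_last("this is the text that (i want(to) extract this text) from"))
# i want(to) extract this text
print(between_first_and_last("a (b) c (d) e"))
# b) c (d
```

The second call shows the trade-off: this works when the string contains a single outer group, but it merges separate groups into one result.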
I think I understand the question: you would like to extract "i want(to) extract this text", or something similar, from input that might look like this: this is the text that (i want(to) extract this text) from
If so, you might find success with the following regular expression (here $text is the variable being examined, the match is stored in the array $t, and $txt holds the matched text):
if (preg_match('/\(\w+.+\)/', $text, $t)) {
    $txt = $t[0];
} else {
    $txt = "";
}
echo $desired=substr($txt,1,-1);
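The regex at the root of this is \(\w+.+\) and here is the explanation of the code: \( matches a literal opening bracket, \w+ matches one or more word characters, .+ greedily matches everything up to the last closing bracket on the line, and \) matches that closing bracket. The final substr($txt,1,-1) strips the outer brackets.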
Using the above I was able to display: i want(to) extract this text from the variable $text = this is the text that (i want(to) extract this text) from.
If you want to pull the "to" out of the (to), I would suggest running the variable through the regex in a loop until no more ( ) pairs are found in the expression and it returns a null value, then concatenating the returned values to form the variable of interest.
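One way to read the loop suggested above is to reapply a greedy "first ( to last )" match to its own result until nothing matches. The sketch below does that in Python for illustration; the name peel_levels is hypothetical, and the pattern \((.+)\) is a simplified stand-in for the regex in this answer, with a capture group instead of the substr call.

```python
import re

# Greedy: from the first "(" to the last ")" in the text; group 1 is the inside.
OUTER = re.compile(r"\((.+)\)")

def peel_levels(text):
    """Return the bracket contents level by level, outermost first."""
    levels = []
    m = OUTER.search(text)
    while m:
        text = m.group(1)      # descend into the text inside the brackets
        levels.append(text)
        m = OUTER.search(text)
    return levels

print(peel_levels("this is the text that (i want(to) extract this text) from"))
# ['i want(to) extract this text', 'to']
```

Because each pass is greedy, this only behaves sensibly when every level contains a single bracketed group; sibling groups at the same level would be merged into one result.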
Best of luck, Steve