In LaTeX, the expression \o{a}{b} means the operator 'o' takes two arguments, a and b. LaTeX also accepts \o{a}, and in this case treats the second argument as the empty string.
Now I try to match the regex \\\\o\{([\s\S]*?)\}\{([\s\S]*?)\} against the string \o{a}\o{a}{b}. It mistakes the whole string for a match when it isn't. (The correct interpretation of this string is that the substring \o{a}{b} is the only match.) The point is I need to know how to tell PHP to recognise that if something other than { follows the first }, then it is not a match.
How should I do that?
Edit: Arguments of an operator are allowed to contain the symbols \, { and }. But in this case the whole string is not a match because the curly brackets in a}\o{a do not conform to LaTeX rules (e.g. { must come before }), so a}\o{a cannot be an argument of an operator.
Edit 2: On the other hand, \o{{a}}{b} should be a match, as {a} is a valid argument.
I suggest something like this:
$s = '\\o{a}\\o{a}{b}';
echo "$s\n"; # Check string
preg_match('~\\\o(\{(?>[^{}\\\]++|(?1)|\\\.)+\}){2}~', $s, $match);
print_r($match);
The regex (as written inside the single-quoted PHP string) treats backslash-escaped characters separately ([^{}\\\] and \\\.) to avoid taking literal braces for syntactical braces:
\\\o            # Matches \o
(               # Group 1, recursed into by (?1) below
  \{            # Matches {
  (?>           # Begin atomic group (a group that does not backtrack, which makes the regex faster)
    [^{}\\\]++  # Any characters except braces and backslash
  |
    (?1)        # Or recurse the outer group
  |
    \\\.        # Or match an escaped character
  )+            # As many times as necessary
  \}            # Closing brace
){2}            # Repeat twice
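As a rough check of the breakdown above, the same pattern can be written as a nowdoc with the x flag, so the comments live inside the regex itself. This is only a sketch: the constant OPERATOR_PATTERN, the helper firstOperator and the test strings are made up for illustration. Note that PHP does not unescape anything in a nowdoc, so \\o here corresponds to \\\o in the single-quoted version.

```php
<?php
// Same pattern as above, in a nowdoc (no PHP-level unescaping) with the x flag.
const OPERATOR_PATTERN = <<<'EOD'
~
\\o                # literal backslash followed by o
(                  # group 1: one brace-delimited argument
  \{
  (?>
      [^{}\\]++    # run of characters other than braces and backslash
    | (?1)         # a nested, balanced pair of braces
    | \\.          # a backslash-escaped character
  )+
  \}
){2}               # exactly two arguments
~x
EOD;

// Hypothetical helper: returns the first two-argument operator, or null.
function firstOperator(string $s): ?string
{
    return preg_match(OPERATOR_PATTERN, $s, $m) === 1 ? $m[0] : null;
}

var_dump(firstOperator('\o{a}\o{a}{b}'));  // the trailing \o{a}{b}
var_dump(firstOperator('\o{{a}}{b}'));     // whole string, nested braces are fine
var_dump(firstOperator('\o{a\}}{b}'));     // whole string, \} is an escaped brace
var_dump(firstOperator('\o{a}'));          // null, only one argument
```

Because the inner group uses + rather than *, an empty argument such as {} is not accepted by this pattern.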
The problem with your current regex is that once the part \\\\o\{([\s\S]*?) has matched, it will look for the next \} that comes along, and there it does not matter whether you are using a lazy quantifier or a greedy one. You need to somehow prevent the argument from matching } before the actual \} in the regex is reached.
That's why you have to use [^{}], and since you can actually have nested braces inside, that's the ideal situation to use recursion.
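To see the difference concretely, here is a small comparison, offered as a sketch rather than a final pattern. $lazy is the regex from the question; $strict (a name chosen for this example) swaps [\s\S]*? for [^{}]*. It does not handle nested braces, which is what the recursion is for.

```php
<?php
// Subject from the question: \o{a} (one argument) followed by \o{a}{b}.
$subject = '\o{a}\o{a}{b}';

// The question's pattern: lazy "anything" inside each pair of braces.
$lazy = '~\\\\o\{([\s\S]*?)\}\{([\s\S]*?)\}~';

// Same shape, but an argument may not contain braces (no nesting support).
$strict = '~\\\\o\{([^{}]*)\}\{([^{}]*)\}~';

preg_match($lazy, $subject, $lazyMatch);
preg_match($strict, $subject, $strictMatch);

echo $lazyMatch[0], "\n";    // \o{a}\o{a}{b}  (first "argument" is a}\o{a)
echo $strictMatch[0], "\n";  // \o{a}{b}
```

The lazy quantifier simply keeps expanding until a }{ sequence shows up, so it swallows a}\o{a as the first argument; the negated class cannot cross a brace, so the attempt at the first \o fails and the engine moves on.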
To deal with possible nested curly brackets, you need to use the recursion feature:
$pattern = <<<'EOD'
~
\\o({(?>[^{}]+|(?-1))*}){2}
~x
EOD;
where (?-1) is a relative reference to the most recently opened capturing group, here the group that wraps one {...} argument.
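One thing to keep in mind with a repeated group like (...){2} is that only the last repetition is kept in the capture, so the first argument is lost. If both arguments are needed, a possible variation (not the pattern above, just a sketch with names invented for this example) is to let group 1 describe the content of an argument and reuse it with (?1):

```php
<?php
// Hypothetical variant: group 1 is the *content* of an argument, and (?1)
// reuses that definition for nested braces and for the second argument.
$pattern = <<<'EOD'
~
\\o
\{ ( (?> [^{}]+ | \{ (?1) \} )* ) \}   # group 1: first argument, outer braces stripped
\{ ( (?1) ) \}                         # group 2: second argument, same grammar
~x
EOD;

$subject = '\o{a}\o{a}{b} and \o{{x}}{y}';
preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);

$pairs = [];
foreach ($matches as $m) {
    $pairs[] = [$m[1], $m[2]];   // [first argument, second argument]
}
print_r($pairs);                 // two pairs: a / b and {x} / y
```

The leading \o{a} is skipped because it is not followed by a second brace group, and the nested {x} is kept intact as the first argument of the last operator.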
I would guess you need to look into using the anchors ^ and $:
// \\\\ in a single-quoted PHP string becomes \\ in the regex, i.e. one literal backslash
$pattern = '/^\\\\o\{.*\}(\{.*\})?$/';
I don't know what you consider acceptable values for a and b, so you can replace .* with an appropriate character class here.
This allows either the \o{a} or the \o{a}{b} format. To match only \o{a}{b}, modify it to this:
$pattern = '/^\\\\o\{.*\}\{.*\}$/';
Based on your last edit, I would suggest replacing .* in the above with [^{]*, as noted in other answers.
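A quick way to check what the anchored pattern accepts once .* is replaced by [^{]* is sketched below. The test strings are taken from the question; the variable names are only for illustration, and the literal backslash is written as \\\\ in the PHP string so that the regex sees \\.

```php
<?php
// Anchored pattern with .* replaced by [^{]*; the second argument is optional.
$pattern = '/^\\\\o\{[^{]*\}(\{[^{]*\})?$/';

$cases = ['\o{a}', '\o{a}{b}', '\o{a}\o{a}{b}', '\o{{a}}{b}'];
$results = [];
foreach ($cases as $case) {
    // preg_match returns 1 on a match, 0 otherwise.
    $results[$case] = preg_match($pattern, $case) === 1;
}
var_export($results);
```

With these inputs, \o{a} and \o{a}{b} are accepted and \o{a}\o{a}{b} is rejected as a whole. \o{{a}}{b} is rejected too, since a character class cannot follow nested braces, and because of the anchors the pattern only tests an entire string rather than finding an operator inside a longer one.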