Use regular expressions to match a ? but not a \?
I have a PHP regular expression that has been working fairly well to parse some odd legacy client templates, until recently, when we found an escaped question mark (\\?) in a template expression. My regular expression-fu isn't strong enough to wrap my feeble noodle around a negative lookahead or similar techno-mumbo-jumbo, so tips or pointers in the right direction would be greatly appreciated.
My PHP:
preg_match_all("/\{\{IF (.*)\?(.*):(.*)\}\}/U", $template, $m, PREG_SET_ORDER);
Okay, I was a little overwhelmed when I posted this question. Allow me to put it into proper context.
Template code looks like this:
{{IF VAR?"SHOW: THIS?":"SHOW {{ELSE}}"}}
Which should be parsed as:
if ($template[$var]) {
    echo "SHOW: THIS?";
} else {
    echo "SHOW ".$template['ELSE'];
}
I am currently almost achieving this with my function, but not entirely. This is the function:
preg_match_all("/\{\{IF ((?:[^\\?]|\\.)*)\?((?:[^\\:]|\\.)*):(.*)\}\}[^<\/]/", $template, $m, PREG_SET_ORDER);
if (count($m)) {
    foreach ($m as $o) {
        if (preg_match("/(.*)\s+(==|!=)\s+(.*)/", $o[1], $x)) {
            if (preg_match("/^\"(.*)\"/", $x[1], $cx)) $e1 = $cx[1];
            else $e1 = is_numeric($x[1])?$x[1]:$data[$x[1]];
            if (preg_match("/^\"(.*)\"/", $x[3], $cx)) $e2 = $cx[1];
            else $e2 = is_numeric($x[3])?$x[3]:$data[$x[3]];
            if (preg_match("/^\"(.*)\"/", $o[2], $ox)) $er[0] = $ox[1];
            else $er[0] = addslashes(htmlspecialchars($data[$o[2]]));
            if (preg_match("/^\"(.*)\"/", $o[3], $ox)) $er[1] = $ox[1];
            else $er[1] = addslashes(htmlspecialchars($data[$o[3]]));
            $eval = "\$od = (\"$e1\" $x[2] \"$e2\")?\"$er[0]\":\"$er[1]\";";
            eval($eval);
        } else {
            $od = $data[$o[1]]?$o[2]:$o[3];
            if (preg_match("/^\"(.*)\"/", $od, $x)) $od = $x[1];
            else $od = $data[$od];
        }
        $template = str_replace($o[0], $od, $template);
    }
}
if (is_array($data))
    foreach ($data as $k => $v) $template = str_replace('{{'.$k.'}}', $v, $template);
return $template;
You need to change your (.*) regions; it's no longer true that you want to match a sequence of anything. Instead, you want to match a sequence of non-escaped characters or escape sequences:
((?:[^\\\\]|\\\\.)*)
That will match any string containing backslashed escapes. (The pattern is written as it appears inside a PHP string literal, where \\\\ becomes the regex \\, which matches one literal backslash.) I think you could possibly improve performance by also excluding question marks from the first region and colons from the second, since an unescaped one cannot appear there; if you did this, you'd end up with the regex
/\\{\\{IF ((?:[^\\\\?]|\\\\.)*)\\?((?:[^\\\\:]|\\\\.)*):(.*)\\}\\}/
While that looks nasty, I've just replaced your first two (.*)s with the construction from above; it's pretty straightforward.
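To make the difference concrete, here is a minimal sketch comparing the original pattern with the escape-aware one. The template string and the variable names ($naive, $aware, $n, $a) are made up for illustration and are not from the original code; the patterns are placed in single-quoted PHP strings, where \\ still collapses to a single backslash.

```php
<?php
// Hypothetical template: the condition contains an escaped question mark (\?).
$template = '{{IF A\?B?yes:no}}';

// Pattern from the question: the ungreedy (.*) stops at the first "?",
// even though that "?" is escaped.
$naive = '/\{\{IF (.*)\?(.*):(.*)\}\}/U';

// Escape-aware pattern from the answer. In the PHP string, \\\\ becomes the
// regex \\ (one literal backslash) and \\? becomes the regex \? (a literal "?").
$aware = '/\\{\\{IF ((?:[^\\\\?]|\\\\.)*)\\?((?:[^\\\\:]|\\\\.)*):(.*)\\}\\}/';

preg_match($naive, $template, $n);
preg_match($aware, $template, $a);

// Print the captured condition from each pattern, then the two branches.
echo $n[1], PHP_EOL;                          // condition cut off at the escaped "?"
echo $a[1], PHP_EOL;                          // condition including the escaped "?"
echo $a[2], ' / ', $a[3], PHP_EOL;            // the two branches
```

With this input, the original pattern should split the condition at the escaped question mark, while the escape-aware pattern should keep A\?B together and capture yes and no as the two branches.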
Why not:
(.*)[^\\]\?(.*)
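One thing to watch with this shorter form, shown here as a rough sketch on a made-up input: the [^\\] has to consume one real character in front of the question mark, so that character ends up in neither capture group. A negative lookbehind, (?<!\\), is one possible alternative that checks the preceding character without consuming it; it is my assumption about what this suggestion is aiming for, not something stated in the thread.

```php
<?php
// Hypothetical input: an escaped "?" followed later by the real separator "?".
$input = 'VAR\?X?yes';

// Suggested form: [^\\] swallows the character just before the "?".
preg_match('/(.*)[^\\\\]\?(.*)/', $input, $m1);

// Lookbehind variant (an assumption, see above): asserts that the "?" is not
// preceded by a backslash, without consuming that preceding character.
preg_match('/(.*)(?<!\\\\)\?(.*)/', $input, $m2);

echo $m1[1], ' | ', $m1[2], PHP_EOL;   // the "X" before the "?" is missing from group 1
echo $m2[1], ' | ', $m2[2], PHP_EOL;   // the "X" is kept in group 1
```

Neither form can tell an escaped backslash followed by a real question mark (\\?) apart from an escaped question mark, which is the case the (?:[^\\]|\\.)* construction is built to handle.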
Here's what worked. Thanks to @absz for a point in the right direction.
preg_match_all("/\{\{IF ([^\"\\]]*(\\.[^\"\\]]*)*)\?((?:[^\\:]|\\.)*):(.*)}\}/", $template, $m, PREG_SET_ORDER);