Wrong matches with a regular expression

Question

$regexp = '/(?:<input\stype="hidden"\sname="){1}([a-zA-Z0-9]*)(?:"\svalue="1"\s\/>)/';
$response = '<input type="hidden" name="7d37dddd0eb2c85b8d394ef36b35f54f" value="1" />';
preg_match($regexp, $response, $matches);

echo $matches[1]; // Outputs: 7d37dddd0eb2c85b8d394ef36b35f54f

I'm using this regular expression to search for an authentication token on a Joomla-based web page in order to perform a scripted login.

I've got all of this working, but I'm wondering what is wrong with my regular expression, as it always returns 2 items.

Array ( [0] => [1] => 7d37dddd0eb2c85b8d394ef36b35f54f)

Also, the name of the input I'm checking for changes on every page load, in both length and value.

Answer 1

Nothing is wrong. Item [0] always contains the entire match. From the docs (emphasis mine):

If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
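To see this in practice, here is a minimal sketch that reuses the pattern and response string from the question. The htmlspecialchars() call is my own addition and rests on an assumption: if the output is viewed in a browser, the raw <input> tag held in $matches[0] would be rendered as an invisible hidden field, which could explain why item [0] looks empty in the question's dump.

```php
<?php
// Pattern and response copied from the question.
$regexp = '/(?:<input\stype="hidden"\sname="){1}([a-zA-Z0-9]*)(?:"\svalue="1"\s\/>)/';
$response = '<input type="hidden" name="7d37dddd0eb2c85b8d394ef36b35f54f" value="1" />';

if (preg_match($regexp, $response, $matches) === 1) {
    // Index 0: the text matched by the whole pattern (the complete tag).
    $fullMatch = $matches[0];
    // Index 1: the text matched by the first capturing group (the token).
    $token = $matches[1];

    // Escape the tag so it stays visible when printed into an HTML page
    // (assumption: the output is being read in a browser).
    echo htmlspecialchars($fullMatch), "\n";
    echo $token, "\n";
}
```

If only the token matters, reading $matches[1] and ignoring $matches[0] is enough; the non-capturing groups never get an index of their own.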

Your regex (overlooking the fact that you are working on HTML with regexes in the first place, which you know you shouldn't) is a bit too complicated.

$regexp = '#<input\s+type="hidden"\s+name="([0-9a-f]*)"\s+value="1"\s*/>#i';

You don't need the non-capturing groups at all.
You use \s, which limits you to a single whitespace character. \s+ is probably better.
Using something other than / as the regex delimiter makes escaping forward slashes in the regex unnecessary.
Making the regex case-insensitive could be useful, too.
The auth token looks like a hex string, so matching the whole a-z range is unnecessary; a-f is enough.
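As a rough illustration of what these changes buy you, the sketch below runs the simplified pattern against two strings. The first is the response from the question; the second is a made-up variation (uppercase tag name, extra spaces, no space before the closing slash) that I invented to exercise \s+, \s* and the i flag. It is not markup taken from Joomla.

```php
<?php
// Simplified pattern from this answer: "#" delimiters, \s+ between
// attributes, a hex-only character class, and the "i" flag.
$regexp = '#<input\s+type="hidden"\s+name="([0-9a-f]*)"\s+value="1"\s*/>#i';

$samples = [
    // Response string from the question.
    '<input type="hidden" name="7d37dddd0eb2c85b8d394ef36b35f54f" value="1" />',
    // Hypothetical variation, invented for illustration only.
    '<INPUT  type="hidden"   name="7D37DDDD0EB2" value="1"/>',
];

$tokens = [];
foreach ($samples as $html) {
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match($regexp, $html, $matches) === 1) {
        $tokens[] = $matches[1]; // first (and only) capturing group
    }
}

print_r($tokens);
```

The original pattern from the question would reject the second string, since it accepts exactly one whitespace character between attributes and is case-sensitive.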

Answer 2

As per the manual entry for preg_match:

If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.

2 answers

Answer 1
3, accepted, 2010-04-07 07:27:36

Answer 2
0, 2010-04-07 07:26:06