
FastCGI 500 error on preg_match_all in PHP

I'm trying to set up some exotic PHP code (I'm not an expert), and I get a FastCGI Error 500 on a PHP line containing 'preg_match_all'.

When I comment out the line, the page is returned with a 200 (but not rendered the way it was meant to be).

The code parses PHP, HTML and JavaScript content loaded from the database and composes it into the finished page.

By adding a few error_log calls, I could determine that the line with preg_match_all is the cause of the 500. However, that line is hit multiple times while the page loads, and on the other occasions it does not cause an error.

Here is exactly what it looks like:

preg_match_all ("/(<([\w]+)[^>]*>)((?:.|\n)*)(<\/\\2>)/",
                $part['data'], $tags, PREG_PATTERN_ORDER|PREG_OFFSET_CAPTURE);

The subject string is a piece of text that looks like:

<script> ... some javascript functions ... </script>

Edit: This code is up and running correctly elsewhere, so this could very well be a PHP setting or environment difference. I'm using PHP 5.2.13 on IIS6 with FastCGI.

Edit: Nothing is mentioned in the log files. At least not in the ones I checked:

  • IIS Logs
  • Event Logs
  • PHP Log

Edit: jab11 has pointed out the problem, but there's no solution yet.

Any thoughts or direction would be welcome.

Any chance that $part['data'] might be extremely big? I used to get a 500 error on preg_match_all when I used it on strings bigger than 100 KB.
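To check whether subject size is really the trigger, it may help to log the subject length and the PCRE error code around the call. The following is a minimal sketch, not the original poster's code: `match_tags` is a hypothetical helper, and the `pcre.recursion_limit` value is an illustrative assumption. The idea is that a lower limit may let PCRE return an error code instead of exhausting the native stack; whether that prevents the 500 on this particular IIS setup is not confirmed.

```php
<?php
// Hypothetical helper (not from the original post): runs the pattern and
// reports a PCRE failure instead of silently treating it as "no matches".
function match_tags($pattern, $subject)
{
    $tags = array();
    $count = preg_match_all($pattern, $subject, $tags,
                            PREG_PATTERN_ORDER | PREG_OFFSET_CAPTURE);

    if ($count === false || preg_last_error() !== PREG_NO_ERROR) {
        error_log('preg_match_all failed, PCRE error code: ' . preg_last_error()
                  . ', subject length: ' . strlen($subject));
        return false;
    }
    return $tags;
}

// Assumption: the value below is illustrative only and needs tuning.
ini_set('pcre.recursion_limit', '10000');

// Same pattern as in the question, written as a single-quoted string.
$tags = match_tags('/(<([\w]+)[^>]*>)((?:.|\n)*)(<\/\\2>)/',
                   '<script> var x = 1; </script>');
```

`preg_last_error()` and the `pcre.recursion_limit` / `pcre.backtrack_limit` settings are available from PHP 5.2.0, so this should apply to the PHP 5.2.13 install mentioned in the question.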

This is a wonderful example of why it's a bad idea to process HTML with regular expressions. I'm willing to bet you're running into a stack overflow because the HTML source string contains some unclosed tags, making the regex try all sorts of permutations in its futile attempt to find a closing tag ( </\\2> ). In an HTML file of 32 KB, it's easy to throw your regex off the trolley. Perhaps the stack size differs between servers, so the code works on one but not the other.

A quick test:

I applied the regex to the source code of this page (after removing the closing </html> tag). RegexBuddy promptly went catatonic for about a minute before matching the <head> and <body> tags (successfully). Debugging the regex from <html> onward showed that the regex engine took 970257 steps to find out that it couldn't match.
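A likely contributor to the cost described above is the `(?:.|\n)*` group, which makes the engine choose between two alternatives for every character of the subject. One possible mitigation, offered as a sketch and not as a confirmed fix for the setup in the question, is to drop the alternation and use the `s` modifier so that `.` also matches newlines. The variable names and the sample subject are illustrative assumptions. This does not answer the broader objection to parsing HTML with regular expressions.

```php
<?php
// The pattern from the question, written as a single-quoted string.
$original = '/(<([\w]+)[^>]*>)((?:.|\n)*)(<\/\\2>)/';

// Assumed-equivalent rewrite: with the "s" modifier "." matches newlines
// too, so the (?:.|\n) alternation is no longer needed.
$rewritten = '/(<(\w+)[^>]*>)(.*)(<\/\\2>)/s';

// Illustrative subject, shaped like the snippet described in the question.
$subject = "<script>\nfunction f() { return 1; }\n</script>";
$flags = PREG_PATTERN_ORDER | PREG_OFFSET_CAPTURE;

preg_match_all($original, $subject, $before, $flags);
preg_match_all($rewritten, $subject, $after, $flags);

// Compare the captures of both patterns on this small sample.
$same = ($before === $after);
```

Both quantifiers are greedy, so the rewrite is meant to capture the same text as the original; it would still be worth comparing results on real data from the database before swapping it in.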
