
php preg_match pattern problem, regular expression pattern

<tr  id='ieconn3' >
  <td><table width='100%'><tr><td valign='top'><table width='100%'><tr><td>aaaaa
<br>&nbsp;</td></tr><tr><td> 

I want to get the aaaaa part, up to the <br> or the </td>. I tried lots of patterns, but none of them worked. Any help?

You shouldn't try to use regular expressions to parse HTML: HTML is not a regular language, so it cannot be described with regular expressions. Use a proper HTML parser instead.

If you're using XHTML, you can use SimpleXML to parse it as XML and query it with SimpleXMLElement::xpath. For HTML documents, you can use the Simple HTML DOM Parser. DOMDocument can even handle both XHTML and HTML.
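As a rough illustration of the parser route, here is a minimal sketch using DOMDocument and DOMXPath on the fragment from the question. The wrapping `<table>` element, the variable names, and the XPath expression `//td[br]/text()[1]` are assumptions made for this example; real markup may need a different query.

```php
<?php
// The fragment from the question, assumed here to be the whole input.
$html = "<tr  id='ieconn3' >
  <td><table width='100%'><tr><td valign='top'><table width='100%'><tr><td>aaaaa
<br>&nbsp;</td></tr><tr><td>";

// The fragment is incomplete HTML, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
// Assumption: wrap the rows in a <table> so they have a table parent.
$doc->loadHTML('<table>' . $html . '</table>');
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// First text node directly inside a <td> that has a <br> child.
$node = $xpath->query('//td[br]/text()[1]')->item(0);
$text = $node !== null ? trim($node->nodeValue) : null;

echo $text;
```

Selecting the text node in front of the `<br>` keeps the `&nbsp;` that follows it out of the result, and `trim()` removes the line break after `aaaaa`.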

As Gumbo pointed out, insisting on a regexp for this will only result in a giant mess. However, if you are sure the HTML does not change, this one will do the trick:

/<tr><td>(.*)<\/td><\/tr>/

Use it like this:

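$string = "<tr  id='ieconn3' >
<td><table width='100%'><tr><td valign='top'><table width='100%'><tr><td>aaaaa<br>&nbsp;</td></tr><tr><td>";

// Capture everything between a plain <tr><td> and the closing </td></tr>.
$matches = array();
if (preg_match("/<tr><td>(.*)<\\/td><\\/tr>/", $string, $matches)) {
    // The captured group still includes the trailing <br>&nbsp;
    print($matches[1]);
}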
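Since the question asks for the text only up to the `<br>` or `</td>`, a tighter pattern may be closer to what is wanted. The following is a sketch, not part of the original answer: the pattern, the `#` delimiters, and the variable names `$m` and `$text` are assumptions, and it relies on the cell holding plain text with no nested tags before the `<br>`.

```php
<?php
$string = "<tr  id='ieconn3' >
<td><table width='100%'><tr><td valign='top'><table width='100%'><tr><td>aaaaa<br>&nbsp;</td></tr><tr><td>";

// [^<]* matches only text (no tags), so the match has to start at a <td>
// whose content is plain text followed directly by <br> or </td>.
// Using "#" as the delimiter avoids escaping the "/" in closing tags.
$text = null;
if (preg_match('#<td>([^<]*)(?:<br\s*/?>|</td>)#i', $string, $m)) {
    $text = trim($m[1]);
}

echo $text;
```

A lazy `(.*?)` would not work here, because the first `<td>` in the string is immediately followed by a nested `<table>`, and the match would start there. Excluding `<` from the captured text avoids that.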


 