preg_match between html tags

I need to grab the names between the HTML tags.

<div class="from"><span class="profile fn">firstnamed familyname</span></div>

So far I have tried this, based on examples from other people with the same question:

preg_match(";from"><span class="profile fn>(.?)</span></div>;", $text, $match)

But it doesn't work.

What is the correct way?

Thanks a lot.

preg_match(";from"><span class="profile fn>(.?)</span></div>;", $text, $match)

... should trigger this, because the unescaped double quotes end the PHP string early:

Parse error: syntax error, unexpected '<'

Apart from that:

  • You are searching for an unclosed attribute that is not in the original text:

    class="profile fn vs class="profile fn"

  • You are matching zero or one characters:

    .?

The fixed regexp would be:

$text = '<div class="from"><span class="profile fn">firstnamed familyname</span></div>';
preg_match(';from"><span class="profile fn">(.*)</span></div>;', $text, $match);
var_dump($match);

Of course, this will probably break on larger HTML documents: .* is greedy, so as soon as another </span></div> appears later on, the capture runs all the way to the last one. Regular expressions are nearly impossible to get right when used for parsing HTML.
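To illustrate the point about larger documents, here is a minimal sketch that compares the greedy pattern with a DOM-based lookup. It assumes the dom extension is available (it usually is by default), and the second "second person" entry is made-up sample data, not part of the original question.

```php
<?php
// Two entries instead of one; the second is hypothetical sample data.
$html = '<div class="from"><span class="profile fn">firstnamed familyname</span></div>'
      . '<div class="from"><span class="profile fn">second person</span></div>';

// The greedy pattern now captures everything up to the LAST </span></div>.
preg_match(';from"><span class="profile fn">(.*)</span></div>;', $html, $greedy);
var_dump($greedy[1]); // contains leftover markup between the two names

// Alternative: let an HTML parser find the elements.
libxml_use_internal_errors(true); // keep warnings about imperfect markup quiet
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Every <span class="profile fn"> directly inside a <div class="from">.
$nodes = $xpath->query('//div[@class="from"]/span[@class="profile fn"]');

$names = [];
foreach ($nodes as $node) {
    $names[] = trim($node->textContent);
}
print_r($names);
```

The XPath expression compares the class attribute as an exact string, so it would need adjusting if the real page adds further class names to these elements.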

This:

preg_match(";from"><span class="profile fn>(.?)</span></div>;", $text, $match)

is syntactically incorrect; you have to escape the double quotes:

preg_match(";from\"><span class=\"profile fn>(.?)</span></div>;", $text, $match)

You need to escape special characters (such as quotes):

preg_match(";from\"><span class\=\"profile fn>(.?)</span></div>;", $text, $match)
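As a rough sketch of the escaping point: the same pattern can be written either as a double-quoted string with escaped quotes or as a single-quoted string with none. The version below also closes the class attribute and uses a lazy (.*?) capture, which goes beyond what the escaping answers change; the variable names are only illustrative.

```php
<?php
$text = '<div class="from"><span class="profile fn">firstnamed familyname</span></div>';

// Double-quoted PHP string: every inner double quote must be escaped.
$escaped = ";from\"><span class=\"profile fn\">(.*?)</span></div>;";

// Single-quoted PHP string: the same pattern, no escaping needed.
$plain = ';from"><span class="profile fn">(.*?)</span></div>;';

// Both literals produce the identical pattern string.
var_dump($escaped === $plain);

// preg_match() returns 1 on a match, 0 on no match, false on error.
if (preg_match($plain, $text, $match) === 1) {
    echo $match[1], "\n";
}
```

Single quotes tend to be the less error-prone choice for patterns that contain HTML attributes, since nothing inside needs escaping.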
