

Regular Expression to ignore a link text

I have the following code:

<p>&nbsp;<img src="spas01.jpg" alt="" width="630" height="480"></p>
<p style="text-align: right;"><a href="spas.html">Spas</a></p>
<p>My Site Content [...]</p>

I need a regular expression to get only the "My Site Content [...]" text. So I need to ignore the first image (and possibly others) and the links.

Try this:
Use (?<=<p>)([^><]+)(?=</p>) or <p>\K([^><]+)(?=</p>). Inside a double-quoted PHP string the backslash is doubled, so \K is written as \\K.

Update

$re = "@<p>\\K([^><]+)(?=</p>)@m";
$str = "<p>&nbsp;<img src=\"spas01.jpg\" alt=\"\" width=\"630\" height=\"480\"></p>\n<p style=\"text-align: right;\"><a href=\"spas.html\">Spas</a></p>\n<p>My Site Content [...]</p>";

preg_match_all($re, $str, $matches);
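
As a quick check of what this pattern captures, here is a minimal sketch that runs it against the markup from the question and prints the matches. It uses a single-quoted pattern (so \K needs no doubling) and drops the capture group and the m flag, which are assumptions made for brevity and do not change what is matched.

```php
<?php
// Same idea as the pattern above: match <p>, discard it with \K,
// then take text containing no angle brackets up to the closing </p>.
$re = '@<p>\K[^><]+(?=</p>)@';

$str = '<p>&nbsp;<img src="spas01.jpg" alt="" width="630" height="480"></p>' . "\n"
     . '<p style="text-align: right;"><a href="spas.html">Spas</a></p>' . "\n"
     . '<p>My Site Content [...]</p>';

preg_match_all($re, $str, $matches);

// $matches[0] holds the full matches.
print_r($matches[0]);
// Array
// (
//     [0] => My Site Content [...]
// )
```

The first paragraph is skipped because its text is followed by an <img> tag rather than </p>, and the second because its opening tag carries an attribute. For the same reason, this pattern would miss any paragraph written as <p class="..."> or containing inline tags, so it fits only markup as simple as the sample.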


With DOMDocument and DOMXPath:

$html = <<<'EOD'
<p>&nbsp;<img src="spas01.jpg" alt="" width="630" height="480"></p>
<p style="text-align: right;"><a href="spas.html">Spas</a></p>
<p>My Site Content [...]</p>
EOD;

$dom = new DOMDocument;
$dom->loadHTML($html);

$xp = new DOMXPath($dom);
// Select every text node inside a <p> that is not nested in an <a>.
$query = '//p//text()[not(ancestor::a)]';

$textNodes = $xp->query($query);

foreach ($textNodes as $textNode) {
    echo $textNode->nodeValue . PHP_EOL;
}
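
Note that the XPath query above also returns the text node holding the non-breaking space (&nbsp;) from the first paragraph, so the output contains one extra, visually empty line. The following is a minimal sketch of one way to drop such nodes. The trimming step and the $texts array are additions for illustration, not part of the original answer.

```php
<?php
$html = <<<'EOD'
<p>&nbsp;<img src="spas01.jpg" alt="" width="630" height="480"></p>
<p style="text-align: right;"><a href="spas.html">Spas</a></p>
<p>My Site Content [...]</p>
EOD;

$dom = new DOMDocument;
$dom->loadHTML($html);
$xp = new DOMXPath($dom);

$texts = [];
foreach ($xp->query('//p//text()[not(ancestor::a)]') as $textNode) {
    // Strip whitespace and non-breaking spaces (U+00A0) from both ends.
    // nodeValue is UTF-8, hence the /u modifier.
    $text = preg_replace('/^[\s\x{00A0}]+|[\s\x{00A0}]+$/u', '', $textNode->nodeValue);
    if ($text !== '') {
        $texts[] = $text;
    }
}

print_r($texts);
// Array
// (
//     [0] => My Site Content [...]
// )
```

Unlike the regular expression, this approach keeps working when a paragraph has attributes or mixes text with inline elements, since the HTML parser handles the tag structure.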
