
PHP Regular expression: exclude href anchor tags

I'm creating a simple search for my application.

I'm using PHP regular expression replacement (preg_replace) to look for a search term (case insensitive) and add <strong> tags around the search term.

preg_replace('/'.$query.'/i', '<strong>$0</strong>', $content);

Now I'm not the greatest with regular expressions. So what would I add to the regular expression so that it does not replace search terms that are in the href of an anchor tag?

That way, if someone searched for "info", it wouldn't change a link to "http://something.com/this_<strong>info</strong>/index.html".
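
To make the problem concrete, here is a small sketch that reproduces it. The sample markup and the variable values are made up for illustration; only the preg_replace call mirrors the one in the question (with preg_quote added so a user-supplied term cannot break the pattern).

```php
<?php
// Hypothetical sample input; $content and $query stand in for the application's data.
$content = '<a href="http://something.com/this_info/index.html">More info</a>';
$query   = preg_quote('info', '/'); // escape regex metacharacters in the search term

// The naive replacement wraps every occurrence, including the one inside href.
$result = preg_replace('/' . $query . '/i', '<strong>$0</strong>', $content);

echo $result, "\n";
// <a href="http://something.com/this_<strong>info</strong>/index.html">More <strong>info</strong></a>
```

The second highlight is wanted; the first one corrupts the link, which is what the answers below try to avoid.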

I believe you will need conditional subpatterns for this purpose:

$query = "link";
$query = preg_quote($query, '/');

$p = '/((<)(?(2)[^>]*>)(?:.*?))*?(' . $query . ')/smi';
$r = "$1<strong>$3</strong>";

$str = '<a href="/Link/foo/the_link.htm">'."\n".'A Link</a>'; // multi-line text
$nstr = preg_replace($p, $r,  $str);
var_dump( $nstr );

$str = 'Its not a Link'; // non-link text
$nstr = preg_replace($p, $r,  $str);
var_dump( $nstr );

Output (view source):

string(61) "<a href="/Link/foo/the_link.htm"> 
A <strong>Link</strong></a>"
string(31) "Its not a <strong>Link</strong>"

PS: The above regex also handles multi-line replacement and, more importantly, it skips matches not only inside href but inside any other HTML tag enclosed in < and >.
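
For comparison, the same "leave everything between < and > alone" idea can also be sketched without a conditional subpattern, by splitting the markup into tags and text and highlighting only the text chunks. This is a different technique from the regex above, not the answer's method; the function name highlight_outside_tags is made up for this example, and it assumes that no attribute value contains a literal > character.

```php
<?php
// Hypothetical helper, not part of the original answer.
function highlight_outside_tags(string $html, string $term): string
{
    if ($term === '') {
        return $html;
    }
    $pattern = '/' . preg_quote($term, '/') . '/i';

    // Split into alternating text and tag chunks; the capture group keeps the tags.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        // Even indexes are text between tags; odd indexes are the captured tags.
        if ($i % 2 === 0) {
            $parts[$i] = preg_replace($pattern, '<strong>$0</strong>', $part);
        }
    }
    return implode('', $parts);
}

$html = '<a href="/Link/foo/the_link.htm">' . "\n" . 'A Link</a>';
echo highlight_outside_tags($html, 'link'), "\n";
```

On the two test strings used above this should give the same highlighted text as the conditional pattern. For anything beyond simple fragments, a real HTML parser is likely the safer choice.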

EDIT: If you just want to exclude hrefs and not all HTML tags, then use this pattern instead of the one above:

$p = '/((<)(?(2).*?href=[^>]*>)(?:.*?))*?(' . $query . ')/smi';

I'm not 100% sure what you are ultimately after here, but from what I can tell, it's a sort of "search phrase" highlighting facility, which highlights keywords so to speak. If so, I suggest having a look at the Text Helper in CodeIgniter. It provides a nice little function called highlight_phrase, and this could do what you are looking for.

The function is as follows.

function highlight_phrase($str, $phrase, $tag_open = '<strong>', $tag_close = '</strong>')
{
    if ($str == '')
    {
        return '';
    }

    if ($phrase != '')
    {
        return preg_replace('/('.preg_quote($phrase, '/').')/i', $tag_open."\\1".$tag_close, $str);
    }

    return $str;
}
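
A quick usage sketch of the helper, with made-up input strings. The function is repeated here behind a function_exists guard only so the snippet runs on its own. Worth noting: as quoted, it escapes the phrase but does not look at the surrounding markup, so a phrase inside an href is wrapped too.

```php
<?php
// Copy of the helper quoted above, guarded so the snippet is self-contained.
if (!function_exists('highlight_phrase')) {
    function highlight_phrase($str, $phrase, $tag_open = '<strong>', $tag_close = '</strong>')
    {
        if ($str == '') {
            return '';
        }
        if ($phrase != '') {
            return preg_replace('/(' . preg_quote($phrase, '/') . ')/i', $tag_open . '\\1' . $tag_close, $str);
        }
        return $str;
    }
}

// Plain text: the phrase is wrapped wherever it occurs, case-insensitively.
$plain = highlight_phrase('Its not a Link', 'link');
echo $plain, "\n";   // Its not a <strong>Link</strong>

// Markup: the occurrence inside the href attribute is wrapped as well.
$markup = highlight_phrase('<a href="/info/">Info</a>', 'info', '<em>', '</em>');
echo $markup, "\n";  // <a href="/<em>info</em>/"><em>Info</em></a>
```

So on its own the helper covers the highlighting part of the question, but not the href exclusion.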

You may use conditional subpatterns; see the explanation here: http://cz.php.net/manual/en/regexp.reference.conditional.php

preg_replace("/(?(?<=href=\")([^\"]*\")|($query))/i","\\1<strong>\\2</strong>",$x);
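
One thing to watch with the replacement string above, as far as I can tell: when the href branch matches, group 2 is empty, so an empty <strong></strong> pair gets written after the attribute value. A possible variant that avoids this uses PCRE's backtracking control verbs (*SKIP)(*FAIL) instead of a conditional subpattern. This is a swapped-in technique, not the answer's own; the input string is made up, and the sketch only handles double-quoted href values.

```php
<?php
$query = preg_quote('info', '/');   // hypothetical search term
$x = '<a href="/info/">Info</a>';   // hypothetical input

// First alternative: consume a whole href="..." attribute, then (*SKIP)(*FAIL)
// rejects it and resumes scanning after the closing quote.
// Second alternative: the search term anywhere else.
$pattern = '/href="[^"]*"(*SKIP)(*FAIL)|' . $query . '/i';

$result = preg_replace($pattern, '<strong>$0</strong>', $x);
echo $result, "\n"; // <a href="/info/"><strong>Info</strong></a>
```

Because the href alternative never produces a match of its own, the replacement string only ever sees the search term.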

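In your case, if you have whole HTML, not just href="", there is an easier solution using the 'e' modifier, which lets you use PHP code when replacing matches. Note that the 'e' modifier was deprecated in PHP 5.5.0 and removed in PHP 7.0.0, so the code below only runs on older PHP versions; preg_replace_callback is the supported replacement.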

function termReplacer($found) {
  $found = stripslashes($found);
  if(substr($found,0,5)=="href=") return $found;
  return "<strong>$found</strong>";
}
echo preg_replace("/(?:href=)?\S*$query/e","termReplacer('\\0')",$x);

See example #4 here: http://cz.php.net/manual/en/function.preg-replace.php. If your expression is even more complex, you can use regular expressions even inside termReplacer().

There is a minor catch in PHP: with the 'e' modifier, quotes in the matched text are escaped with backslashes, so the $found parameter in termReplacer() needs to be passed through stripslashes()!
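
On PHP versions without the 'e' modifier, the same logic can be sketched with preg_replace_callback. The pattern is the one from the code above; the sample string and search term are made up for illustration. The callback receives the raw match, so the stripslashes() step is not needed here.

```php
<?php
$query = preg_quote('info', '/'); // hypothetical search term
$x = 'See href="http://something.com/this_info/index.html" for more info';

// Optional "href=" prefix plus any run of non-space characters ending in the term.
$pattern = '/(?:href=)?\S*' . $query . '/';

$result = preg_replace_callback($pattern, function (array $m): string {
    // $m[0] is the whole match, passed in unescaped.
    if (substr($m[0], 0, 5) === 'href=') {
        return $m[0]; // leave href values untouched
    }
    return '<strong>' . $m[0] . '</strong>';
}, $x);

echo $result, "\n";
// See href="http://something.com/this_info/index.html" for more <strong>info</strong>
```

As in the original, \S* pulls any non-space characters directly before the term into the match, so a word like "this_info" outside an href would be wrapped as a whole.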


 