I want to replace all occurrences of a string which are not enclosed within < … >
I want to do server-side highlighting of searched words by processing the output before sending it out.
The reasoning behind "server-side" is:
1) JavaScript highlighting for Unicode text sucks.
2) \b does not work with Unicode (at least in JS, AFAIK).
3) No lookbehind support in JS.
I was using the function below, but last night I realized that the first part, which was written to skip <...>, is not working.
    public function ss_highlight($terms, $buf)
    {
        if (empty($terms)) {
            return $buf;
        }
        /* sort before using length for better match */
        usort($terms, function($a, $b) {
            return mb_strlen($b) - mb_strlen($a);
        });
        $str_terms = implode('|', $terms);
        /* server side highlighter */
        $buf = preg_replace('/(<[^>]+>)*(?<=[\s|:|\-|>|\(|\)|\.|,|\/|^])('.$str_terms.')(?=[\s|:|\-|<|\(|\)|\.|,|\/]|$)/i', '$1<span class="highlight">$2</span>', $buf);
        return $buf;
    }
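The failure can be reproduced when a tag is followed by other text before the term: the optional (<[^>]+>)* group then matches nothing, and the lookbehind accepts the / inside the attribute. A minimal sketch, assuming a hypothetical input string and the single term foo:

```php
<?php
// Hypothetical input: the term "foo" occurs inside an attribute and in the text.
$buf = '<a href="/foo/">see foo</a>';
$str_terms = 'foo';

// Same pattern and replacement as in ss_highlight() above.
$out = preg_replace(
    '/(<[^>]+>)*(?<=[\s|:|\-|>|\(|\)|\.|,|\/|^])(' . $str_terms . ')(?=[\s|:|\-|<|\(|\)|\.|,|\/]|$)/i',
    '$1<span class="highlight">$2</span>',
    $buf
);

echo $out, PHP_EOL;
// prints: <a href="/<span class="highlight">foo</span>/">see <span class="highlight">foo</span></a>
```

The span injected into the href value is the unwanted replacement inside < … >.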
Any ideas would be appreciated.
Regards.
PS: I saw some similar things in "Replacing all occurences of a specific word which are not enclosed with the words OPEN and CLOSE?" but cannot figure out how to fit this to my requirements.
DO NOT try to parse HTML with regular expressions! Use an HTML parser!
See "Highlight Search Terms in PHP without breaking anchor tags using regex" and "RegEx match open tags except XHTML self-contained tags".
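A minimal sketch of the parser-based approach, assuming PHP's bundled DOM extension (DOMDocument, DOMXPath) and a well-formed UTF-8 HTML fragment as input. The function name highlight_text_nodes and the wrapper div are hypothetical and do not come from the linked questions. Only text nodes are rewritten, so tag names and attribute values are never touched:

```php
<?php
// Hypothetical helper: name, signature and wrapper <div> are illustrative only.
function highlight_text_nodes(array $terms, string $html): string
{
    if (empty($terms)) {
        return $html;
    }
    // Longest terms first, so longer alternatives are tried before shorter ones.
    usort($terms, function ($a, $b) {
        return mb_strlen($b) - mb_strlen($a);
    });
    // Escape regex metacharacters in the terms; "u" makes \pL work on UTF-8.
    $quoted = array_map(function ($term) {
        return preg_quote($term, '/');
    }, $terms);
    $pattern = '/(?<!\pL)(' . implode('|', $quoted) . ')(?!\pL)/iu';

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    // The XML declaration is a commonly used hint so libxml reads the input as UTF-8.
    $doc->loadHTML('<?xml encoding="utf-8"?><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Our wrapper is the first <div> in document order.
    $root = $doc->getElementsByTagName('div')->item(0);
    $xpath = new DOMXPath($doc);
    // Collect the text nodes first, then modify; skip <script> and <style>.
    $textNodes = iterator_to_array(
        $xpath->query('.//text()[not(ancestor::script) and not(ancestor::style)]', $root)
    );
    foreach ($textNodes as $node) {
        $parts = preg_split($pattern, $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if ($parts === false || count($parts) < 2) {
            continue; // no term in this text node
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i % 2 === 1) { // odd indexes hold the captured terms
                $span = $doc->createElement('span');
                $span->setAttribute('class', 'highlight');
                $span->appendChild($doc->createTextNode($part));
                $fragment->appendChild($span);
            } elseif ($part !== '') {
                $fragment->appendChild($doc->createTextNode($part));
            }
        }
        $node->parentNode->replaceChild($fragment, $node);
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo highlight_text_nodes(['foo'], '<a href="/foo/">see foo</a>'), PHP_EOL;
```

Serializing through the DOM may normalize the markup (quoting, entities, unclosed tags), so the output is not guaranteed to be byte-identical to the input outside the highlighted terms.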
Actually, everybody knows that using regex for HTML is a bad idea, but in this case we really do not need the DOM, because we just want to replace some text occurring outside of any < ... >.
This solution seems to work fine for me:
    public function ss_highlight($terms, $buf)
    {
        if (empty($terms)) {
            return $buf;
        }
        /* sort before using length for better match */
        usort($terms, function($a, $b) {
            return mb_strlen($b) - mb_strlen($a);
        });
        $str_terms = implode('|', $terms);
        /* server side highlighter */
        /* group 1: text up to the next tag; group 2: the rest of that tag, or end of input */
        $buf = preg_replace_callback('#((?:(?!<[/a-z]).)*)([^>]*>|$)#si',
            function ($matches) use ($str_terms) {
                //return preg_replace('/(?<=[\s:\-\>\(\)\.,\/^])('.$str_terms.')(?=[\s:\-\<\(\)\.,\/]|$)/i', '<span class="highlight">$1</span>', $matches[1]).$matches[2];
                /* highlight only in group 1, and only terms not directly preceded or followed by a letter */
                return preg_replace('/(?<!\pL)('.$str_terms.')(?!\pL)/i', '<span class="highlight">$1</span>', $matches[1]).$matches[2];
            }, $buf);
        return $buf;
    }
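Two details of the solution above may be worth hardening: the terms are inserted into the pattern unescaped, and the pattern is matched byte by byte unless the u modifier is set, which affects \pL on multibyte text. A sketch of one possible variant, assuming UTF-8 input; the function name ss_highlight_quoted is hypothetical, and preg_quote() plus the u modifier are additions that are not part of the original answer:

```php
<?php
// Hypothetical standalone variant of ss_highlight(); not the original author's code.
function ss_highlight_quoted(array $terms, string $buf): string
{
    if (empty($terms)) {
        return $buf;
    }
    // Longest terms first, so longer alternatives are tried before shorter ones.
    usort($terms, function ($a, $b) {
        return mb_strlen($b) - mb_strlen($a);
    });
    // Escape regex metacharacters (and the "/" delimiter) in every term.
    $quoted = array_map(function ($term) {
        return preg_quote($term, '/');
    }, $terms);
    // "u" makes PCRE treat pattern and subject as UTF-8, so \pL sees whole letters.
    $word = '/(?<!\pL)(' . implode('|', $quoted) . ')(?!\pL)/iu';

    // Same outer pattern as the answer: group 1 is the text before a tag,
    // group 2 is the rest of that tag (or the end of the buffer).
    $result = preg_replace_callback(
        '#((?:(?!<[/a-z]).)*)([^>]*>|$)#si',
        function ($m) use ($word) {
            $text = preg_replace($word, '<span class="highlight">$1</span>', $m[1]);
            // preg_replace() returns null on error (e.g. invalid UTF-8): keep the text as-is.
            return ($text ?? $m[1]) . $m[2];
        },
        $buf
    );
    // preg_replace_callback() also returns null on error; fall back to the input.
    return $result ?? $buf;
}

echo ss_highlight_quoted(['foo'], '<a href="/foo/">see foo</a>'), PHP_EOL;
echo ss_highlight_quoted(['c++'], 'I like c++ a lot'), PHP_EOL;
```

With the term foo and the input <a href="/foo/">see foo</a>, only the foo in the link text is wrapped; the one inside href is left alone. As in the original, a literal < followed by a letter in plain text is still treated as the start of a tag.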
Thanks to everybody.