
Replacing words with tag links in PHP

I have a text ($text) and an array of words ($tags). These words should be replaced in the text with links to other pages, without breaking any links that already exist in the text. CakePHP's TextHelper has a method for this, but it is buggy and breaks the existing HTML links in the text. The method is supposed to work like this:

$text=Text->highlight($text,$tags,'<a href="/tags/\1">\1</a>',1);

Below is the existing code in CakePHP's TextHelper:

function highlight($text, $phrase, $highlighter = '<span class="highlight">\1</span>', $considerHtml = false) {
  if (empty($phrase)) {
    return $text;
  }

  if (is_array($phrase)) {
    $replace = array();
    $with = array();

    foreach ($phrase as $key => $value) {
      $key = $value;
      $value = $highlighter;
      $key = '(' . $key . ')';
      if ($considerHtml) {
        $key = '(?![^<]+>)' . $key . '(?![^<]+>)';
      }
      $replace[] = '|' . $key . '|ix';
      $with[] = empty($value) ? $highlighter : $value;
    }
    return preg_replace($replace, $with, $text);
  } else {
    $phrase = '(' . $phrase . ')';
    if ($considerHtml) {
      $phrase = '(?![^<]+>)' . $phrase . '(?![^<]+>)';
    }

    return preg_replace('|'.$phrase.'|i', $highlighter, $text);
  }
}

You can see (and run) this algorithm here:

http://www.exorithm.com/algorithm/view/highlight

It can be made a little better and simpler with a few changes, but it still isn't perfect. Though less efficient, I'd recommend one of Ben Doom's solutions.
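To see where it falls short, here is a minimal sketch that runs the same guard pattern highlight() builds when $considerHtml is true. The sample HTML and the "php" tag are made up for illustration. The lookahead keeps the match out of tag attributes, but nothing stops it from matching the text of an existing link:

```php
<?php
// Same pattern shape that highlight() builds for a single phrase.
$pattern = '|(?![^<]+>)(php)(?![^<]+>)|i';
$link = '<a href="/tags/\1">\1</a>';

// Hypothetical input: "php" appears in an href and in the link text.
$html = 'See <a href="/php-intro">the PHP intro</a>.';

$result = preg_replace($pattern, $link, $html);
echo $result, "\n";
// See <a href="/php-intro">the <a href="/tags/PHP">PHP</a> intro</a>.
```

The href is left alone, but the link text gets a second anchor nested inside the first one, which is the breakage described in the question.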

Replacing text in HTML is fundamentally different from replacing plain text. To determine whether a piece of text is part of an HTML tag, you have to find all the tags so that you can exclude them. Regex is not really the tool for this.

I would attempt one of the following solutions:

  • Find the positions of all the words. Working from last to first, determine whether each one is part of a tag. If not, add the anchor.
  • Split the string into blocks. Each block is either a tag or plain text. Run your replacement(s) on the plain-text blocks, and re-assemble.
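The second option could be sketched roughly like this. The function name linkTagsInHtml() and the /tags/ URL scheme are assumptions for illustration, and the sketch assumes the tags consist of word characters and that the HTML has no comments or script blocks containing ">":

```php
<?php
// Hypothetical helper: link tags in plain-text blocks only.
function linkTagsInHtml($html, array $tags)
{
    if (empty($tags)) {
        return $html;
    }

    // Try longer tags first so "php framework" is preferred over "php".
    usort($tags, function ($a, $b) {
        return strlen($b) - strlen($a);
    });
    $quoted = array_map(function ($tag) {
        return preg_quote($tag, '~');
    }, $tags);
    $pattern = '~\b(' . implode('|', $quoted) . ')\b~i';

    // With DELIM_CAPTURE, even indexes are plain text and odd indexes are tags.
    $blocks = preg_split('~(<[^>]*>)~', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    $anchorDepth = 0;
    foreach ($blocks as $i => $block) {
        if ($i % 2 === 1) {
            // Track whether we are inside an existing <a>...</a>.
            if (preg_match('~^<a[\s>]~i', $block)) {
                $anchorDepth++;
            } elseif (preg_match('~^</a\s*>~i', $block)) {
                $anchorDepth = max(0, $anchorDepth - 1);
            }
            continue;
        }
        if ($anchorDepth === 0 && $block !== '') {
            // One pass over all tags, so inserted links are never re-scanned.
            $blocks[$i] = preg_replace_callback($pattern, function ($m) {
                return '<a href="/tags/' . rawurlencode($m[1]) . '">' . $m[1] . '</a>';
            }, $block);
        }
    }

    return implode('', $blocks);
}
```

Matching all tags with a single alternation means a later tag cannot match inside a link that an earlier tag just produced.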

I think the first one is probably a bit more efficient, but it is also more prone to programmer error, so I'll leave the choice up to you.
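For comparison, a rough and unoptimized sketch of the first option. linkTagsByPosition() and the /tags/ URL scheme are hypothetical names, and the same assumptions apply (tags made of word characters, no ">" inside comments or scripts):

```php
<?php
// Hypothetical helper: find every match, then link from last to first.
function linkTagsByPosition($html, array $tags)
{
    if (empty($tags)) {
        return $html;
    }

    // Try longer tags first so "php framework" is preferred over "php".
    usort($tags, function ($a, $b) {
        return strlen($b) - strlen($a);
    });
    $quoted = array_map(function ($tag) {
        return preg_quote($tag, '~');
    }, $tags);
    $pattern = '~\b(?:' . implode('|', $quoted) . ')\b~i';

    // PREG_OFFSET_CAPTURE yields [matched text, byte offset] pairs.
    preg_match_all($pattern, $html, $found, PREG_OFFSET_CAPTURE);

    // Go from last to first so the remaining offsets stay valid.
    foreach (array_reverse($found[0]) as $match) {
        list($word, $offset) = $match;
        $before = substr($html, 0, $offset);

        // Inside a tag: the nearest "<" before the word is not closed yet.
        $lastOpen = strrpos($before, '<');
        $lastClose = strrpos($before, '>');
        $insideTag = $lastOpen !== false
            && ($lastClose === false || $lastClose < $lastOpen);

        // Inside a link: more <a ...> openings than </a> closings so far.
        $opened = preg_match_all('~<a[\s>]~i', $before, $unused);
        $closed = preg_match_all('~</a\s*>~i', $before, $unused);

        if (!$insideTag && $opened <= $closed) {
            $link = '<a href="/tags/' . rawurlencode($word) . '">' . $word . '</a>';
            $html = substr_replace($html, $link, $offset, strlen($word));
        }
    }

    return $html;
}
```

The per-match bookkeeping (tag state and anchor state) is where the programmer error tends to creep in.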

If you want to know why I'm not approaching this problem directly, look at all the questions on the site about regex and HTML, and how regex is not a parser.

This code works just fine. What you may need to do is check the CSS for <span class="highlight"> and make sure it is set to a color that makes the highlighting visible.

.highlight { background-color: #FFE900; }

Amorphous, I noticed Gert edited your post. Are the two code fragments exactly as you posted them?

Even though the original code was designed for highlighting, I understand you're trying to repurpose it for generating links. It should work fine for that, and it does (tested as posted).

However, escaping in the first code fragment could be an issue.

$text=Text->highlight($text,$tags,'<a href="/tags/\1">\1</a>',1);

This works fine, but if you use double quotes rather than single quotes, PHP consumes the backslashes as escape characters, so you need to escape them. If you don't, you get %01 links.

The correct way with double quotes is:

$text=Text->highlight($text,$tags,"<a href=\"/tags/\\1\">\\1</a>",1);

(Notice the use of \\1 instead of \1)
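A small sketch of the difference, using plain preg_replace() rather than the helper. The variable names and the sample sentence are made up for illustration:

```php
<?php
// Single quotes: \1 reaches preg_replace() untouched.
$single = '<a href="/tags/\1">\1</a>';

// Double quotes, unescaped: PHP reads \1 as an octal escape (byte 0x01).
$unescaped = "<a href=\"/tags/\1\">\1</a>";

// Double quotes, escaped: \\1 becomes the two characters \1.
$escaped = "<a href=\"/tags/\\1\">\\1</a>";

var_dump($single === $escaped);                   // bool(true)
var_dump(strpos($unescaped, "\x01") !== false);   // bool(true)

echo preg_replace('|(php)|i', $escaped, 'I like PHP'), "\n";
// I like <a href="/tags/PHP">PHP</a>
```

The 0x01 byte in the unescaped version is what shows up as %01 once it lands in a URL.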
