JavaScript regular expression that ignores a substring

Background:

I found similar SO posts on this topic, but I failed to make them work for my scenario. Apologies in advance if this is a dupe.

My Intent:

Take every English word in a string and convert it to an HTML hyperlink. This logic needs to ignore only the following markup: <br/>, <b>, </b>

Here's what I have so far. It converts English words to hyperlinks as I expect, but it has no ignore logic for HTML tags (that's where I need your help):

text = text.replace(/\b([A-Z\-a-z]+)\b/g, "<a href=\"?q=$1\">$1</a>");

Example Input / Output:

Sample Input:

this <b>is</b> a test

Expected Output:

<a href="?q=this">this</a> <b><a href="?q=is">is</a></b> <a href="?q=a">a</a> <a href="?q=test">test</a>

Thank you.

Issues with regexing HTML aside, the way I'd do this is in two steps:

  • First and foremost, one way or another, extract the text outside the tags
  • Then apply the transform only to that text, and leave everything else untouched
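The two steps above could be sketched roughly as follows. This is only an illustration, not the answerer's own code: the function name linkifyWords is made up here, and the sketch assumes the markup is simple (no "<" or ">" inside text or attribute values). It relies on the fact that split() with a capturing group keeps the matched tags in the result array.

```javascript
// Hypothetical helper for illustration; the name is not from the original post.
function linkifyWords(html) {
    // Splitting on a capturing group keeps the tags in the result,
    // so text lands at even indices and tags at odd indices.
    var parts = html.split(/(<[^>]*>)/);

    // Transform only the text parts; tags are left untouched.
    for (var i = 0; i < parts.length; i += 2) {
        parts[i] = parts[i].replace(/\b([A-Z\-a-z]+)\b/g, '<a href="?q=$1">$1</a>');
    }

    return parts.join('');
}
```

Called as linkifyWords('this <b>is</b> a test'), this should produce the expected output given in the question.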


Here's a hybrid solution that gives you the performance gain of innerHTML and the luxury of not having to mess with HTML strings when looking for the matches:

function findMatchAndReplace(node, regex, replacement) {

    var parent,
        temp = document.createElement('div'),
        next;

    if (node.nodeType === 3) {

        parent = node.parentNode;

        temp.innerHTML = node.data.replace(regex, replacement);

        while (temp.firstChild)
            parent.insertBefore(temp.firstChild, node);

        parent.removeChild(node);

    } else if (node.nodeType === 1) {

        if (node = node.firstChild) do {
            next = node.nextSibling;
            findMatchAndReplace(node, regex, replacement);
        } while (node = next);

    }

}

Input:

<div id="foo">
    this <b>is</b> a test
</div>

Process:

findMatchAndReplace(
    document.getElementById('foo'),
    /\b\w+\b/g,
    '<a href="?q=$&">$&</a>'
);

Output (whitespace added for clarity):

<div id="foo">
    <a href="?q=this">this</a>
    <b><a href="?q=is">is</a></b>
    <a href="?q=a">a</a>
    <a href="?q=test">test</a>
</div>

Here's another JavaScript method.

var StrWith_WELL_FORMED_TAGS    = "This <b>is</b> a test, <br> Mr. O'Leary! <!-- What about comments? -->";
var SplitAtTags                 = StrWith_WELL_FORMED_TAGS.split (/[<>]/);
var ArrayLen                    = SplitAtTags.length;
var OutputStr                   = '';

//-- split() yields a leading empty string when the input starts with "<",
//-- so tag contents always land at odd indices.
for (var J=0;  J < ArrayLen;  J++)
{
    var bWeAreInsideTag         = (J % 2) == 1;

    if (bWeAreInsideTag)
    {
        OutputStr              += '<' + SplitAtTags[J] + '>';
    }
    else
    {
        OutputStr              += SplitAtTags[J].replace (/([a-z']+)/gi, '<a href="?q=$1">$1</a>');
    }
}

//-- Replace "console.log" with "alert" if not using Firebug.
console.log (OutputStr);
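
One detail worth checking with the split-based approach is what happens when the string begins with a tag. In JavaScript, split() returns an empty string as the first element in that case, so the tag contents sit at odd indices whether or not the input starts with "<". A minimal check, with a helper name (splitAtTags) that is made up for this illustration:

```javascript
// Hypothetical helper for illustration; not part of the original answer.
function splitAtTags(str) {
    return str.split(/[<>]/);
}

var parts = splitAtTags("<b>is</b> a test");
// parts[0] is the empty string before the leading "<";
// parts[1] is "b", parts[2] is "is", parts[3] is "/b", parts[4] is " a test".
console.log(parts);
```

Like the method above, this only holds for well-formed input where "<" and ">" never appear in the text or inside attribute values.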
