
Match characters at start of string, ignore strings in html tags

A little help required, please...

I have a regular expression that matches characters at the start of a word within a string.

Say I have a set of strings like so:

Ray Fox 
Foster Joe
Finding Forrester

REGEX

/\bfo[^\b]*?\b/gi 

This will match 'FO' in Fox, Foster, and Forrester, as expected.

However, I run into a problem when the strings are wrapped in HTML tags, like so:

<span class="fontColor1">Ray Fox</span>
<span class="fontColor2">Foster Joe</span>
<span class="fontColor3">Finding Forrester</span>

This will match 'FO' in fontColor* as well.

I'm fairly green with regular expressions. I need a little help updating the pattern so that it only searches the text between HTML tags when tags are present, but still works correctly when there are no HTML tags.
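
The behaviour described above can be reproduced with a short sketch. The helper name `findMatches` is made up for illustration; the pattern is the one from the question, unchanged. Note that, as written, each match runs to the end of the word, because inside a character class `\b` means the backspace character rather than a word boundary.

```javascript
// The pattern from the question, unchanged.
const pattern = /\bfo[^\b]*?\b/gi;

// Hypothetical helper: return every match in a string (empty array if none).
function findMatches(input) {
  return input.match(pattern) || [];
}

// Plain text: only the word starting with "fo" is found.
console.log(findMatches("Ray Fox")); // ["Fox"]

// Wrapped in a tag: the class name is picked up as well.
console.log(findMatches('<span class="fontColor1">Ray Fox</span>')); // ["fontColor1", "Fox"]
```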

What about:

<.*?span.*?>(.*?)<\s?\/.*?span.*?>

And where do you have text where HTML tags don't exist? That makes no sense.

EDIT:

This solution will not match nested tags, but as the question is written, that doesn't seem to be an issue.
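
One way the span pattern above might be combined with the original word pattern is sketched below. The function name `matchInsideSpans` and the fallback to the whole string when no span is found are assumptions added for illustration, not part of the answer; the nested-tag limitation still applies.

```javascript
// Span pattern from this answer (with the g flag added) and the
// word pattern from the question.
const spanPattern = /<.*?span.*?>(.*?)<\s?\/.*?span.*?>/g;
const wordPattern = /\bfo[^\b]*?\b/gi;

// Hypothetical helper: search only the text captured between span tags.
// If the input contains no span at all, search the whole string instead.
function matchInsideSpans(input) {
  const inner = [...input.matchAll(spanPattern)].map((m) => m[1]);
  const targets = inner.length > 0 ? inner : [input];
  return targets.flatMap((text) => text.match(wordPattern) || []);
}

const html = [
  '<span class="fontColor1">Ray Fox</span>',
  '<span class="fontColor2">Foster Joe</span>',
  '<span class="fontColor3">Finding Forrester</span>',
].join("\n");

console.log(matchInsideSpans(html));      // ["Fox", "Foster", "Forrester"]
console.log(matchInsideSpans("Ray Fox")); // ["Fox"]
```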

You can use an HTML parser to extract the plain text, and then match against that.

var root;

try {
    root = document.implementation.createHTMLDocument("").body;
}
catch(e) {
    root = document.createElement("body");
}

root.innerHTML = '<span class="fontColor1">Ray Fox</span>\
            <span class="fontColor2">Foster Joe</span>\
            <span class="fontColor3">Finding Forrester</span>';

//If you are using jQuery
var text = $(root).text();

//Proceed as normal with the text variable

If you are not using jQuery, you can replace $(root).text() with findText(root), where findText is:

function findText(root) {
    var ret = "",
        nodes = root.childNodes;
    for (var i = 0; i < nodes.length; ++i) {
        if (nodes[i].nodeType === 3) {
            ret += nodes[i].nodeValue;
        } else if (nodes[i].nodeType === 1) {
            ret += findText(nodes[i]);
        }
    }
    return ret;
}
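
To see how findText behaves outside a browser, here is a minimal sketch that runs it against plain objects standing in for DOM nodes. The `text` and `element` helpers are hypothetical and only mimic the three properties findText reads (`nodeType`, `nodeValue`, `childNodes`); in a real page you would pass an actual element instead. The function is repeated so the sketch is self-contained.

```javascript
// Same traversal as findText above.
function findText(root) {
  var ret = "",
      nodes = root.childNodes;
  for (var i = 0; i < nodes.length; ++i) {
    if (nodes[i].nodeType === 3) {          // text node
      ret += nodes[i].nodeValue;
    } else if (nodes[i].nodeType === 1) {   // element node: recurse
      ret += findText(nodes[i]);
    }
  }
  return ret;
}

// Hypothetical stand-ins for DOM nodes (3 = text, 1 = element).
const text = (value) => ({ nodeType: 3, nodeValue: value, childNodes: [] });
const element = (...children) => ({ nodeType: 1, childNodes: children });

const root = element(
  element(text("Ray Fox")),
  text("\n"),
  element(text("Foster Joe")),
  text("\n"),
  element(text("Finding "), element(text("Forrester"))) // nested element
);

// Only text content survives, so attribute values can no longer match.
const plainText = findText(root);
const matches = plainText.match(/\bfo[^\b]*?\b/gi) || [];
console.log(matches); // ["Fox", "Foster", "Forrester"]
```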
