
Regular expression: Match text node only in a tag

I've been working on a highlight script. The first result can be found here: substring selector with jquery?

The script: http://jsfiddle.net/TPg9p/3/

Unfortunately, it only works with a simple string. I want it to work with strings that contain tags.

Example:

<li>sample string li span style="color:red" id 
    <span id="toto" style="color:red">color id</span> 
    abcde
</li>

So if the user searches for span, it should only match the word span in the text inside the <li> (before the span tag), but not the span tag itself. The matched string is then replaced with <span class="highlight">span</span>. The same goes for attribute names and attribute values. Anything inside an opening tag or a closing tag should be ignored.

Since HTML is about the DOM and nodes, could we parse this string into nodes and then select only the text nodes for replacement?

Please answer by updating the jsFiddle above.

UPDATED

Demo of working solution by Tibos: http://jsfiddle.net/TPg9p/10/

Disclaimer: You should use an HTML parser instead of a regexp here.

The regular expression you are looking for is this one:

/span(?=[^>]*<)/

Example usage:

var str = '<li>sample string li span style="color:red" id ' + 
    '<span id="toto" style="color:red">color id</span> ' +
    'abcde' +
    '</li>';
var keyword = 'span';
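// The 'g' flag highlights every occurrence instead of only the first one.
var regexp = new RegExp(keyword + '(?=[^>]*<)', 'g');
// replace() returns a new string, so keep the result.
var highlighted = str.replace(regexp, '<span class="highlight">$&</span>');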

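The regexp matches your word only when it is followed by a < before any >, i.e. when the next angle bracket after the word opens a tag rather than closes one. A word inside a tag is followed by that tag's closing > first, so it is skipped.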
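To see the lookahead at work, here is a minimal sketch that lists where the pattern matches in the sample string. The helper name matchOffsets is made up for illustration and is not part of the fiddle.

```javascript
// Sample string from the question.
var str = '<li>sample string li span style="color:red" id ' +
    '<span id="toto" style="color:red">color id</span> ' +
    'abcde' +
    '</li>';

// Hypothetical helper: offsets of every occurrence of `keyword` that is
// followed by a '<' before any '>' (that is, it sits in text, not in a tag).
// `keyword` is assumed to contain no regexp metacharacters here.
function matchOffsets(html, keyword) {
    var regexp = new RegExp(keyword + '(?=[^>]*<)', 'g');
    var offsets = [];
    var m;
    while ((m = regexp.exec(html)) !== null) {
        offsets.push(m.index);
    }
    return offsets;
}

var spanOffsets = matchOffsets(str, 'span'); // only the "span" in the text
var idOffsets = matchOffsets(str, 'id');     // the two "id" words in the text
```

The span in the opening and closing tags and the id attribute are skipped, because each of them reaches a > before any <.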

EDIT: Seeing that you don't have valid HTML (the string doesn't start and end with a tag), you can change the regular expression to also accept the end of the string in place of the beginning of a tag:

/span(?=[^>]*(?:<|$))/

DEMO: http://jsfiddle.net/TPg9p/8/
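The difference between the two patterns shows up on a fragment whose last text run is not followed by a tag. A small sketch, using a made-up fragment:

```javascript
// Made-up fragment: it ends with plain text, not with a closing tag.
var fragment = 'first span <b title="span">bold</b> last span';

var tagOnly = /span(?=[^>]*<)/g;        // requires a '<' later on
var tagOrEnd = /span(?=[^>]*(?:<|$))/g; // also accepts the end of the string

var withTagOnly = fragment.replace(tagOnly, '[$&]');
// 'first [span] <b title="span">bold</b> last span'

var withTagOrEnd = fragment.replace(tagOrEnd, '[$&]');
// 'first [span] <b title="span">bold</b> last [span]'
```

In both cases the span inside the title attribute is left alone; only the second pattern catches the trailing word.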

EDIT: Added regexp escaping: .replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'). Courtesy of this answer: Is there a RegExp.escape function in Javascript?
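The escaping matters because an unescaped keyword is read as a pattern: in a.b, for instance, the dot would match any character. Below is a minimal sketch that combines the escape with the lookahead; the function names escapeRegExp and highlight are illustrative and not taken from the fiddle.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
    return text.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Wrap every occurrence of `keyword` that sits outside of a tag.
function highlight(html, keyword) {
    var regexp = new RegExp(escapeRegExp(keyword) + '(?=[^>]*(?:<|$))', 'g');
    return html.replace(regexp, '<span class="highlight">$&</span>');
}

var escaped = escapeRegExp('a.b(c)');
// 'a\.b\(c\)' (each metacharacter now has a backslash in front of it)

var result = highlight('<p>a.b and axb</p>', 'a.b');
// '<p><span class="highlight">a.b</span> and axb</p>'
```

Without the escape, the same call would also wrap axb.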

Instead of attempting to get the correct string with regexes, work with text nodes only:

$('#submit').click(function () {
    // Escape regexp metacharacters so the search term is matched literally.
    var replacePattern = new RegExp(
        $('#search').val().replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'),
        'gi');
    $('#sample').children().addBack().not('.highlight')
        .contents().filter(function () {
            return this.nodeType === 3; // keep text nodes only
        }).replaceWith(function () {
            return this.data.replace(replacePattern, '<b class="highlight">$&</b>');
        });
});

Demo.

Explanation: first you collect the #sample element and its descendants (direct children only, since children() is used; it's possible to use find('*') as well, of course, to reach every level). Then .highlight elements are filtered out of that selection. This step is optional, but it made little sense to me to highlight within something that's already highlighted.

Once you have all the elements to be processed, you collect all their child nodes with .contents() and filter the collection (with a nodeType check) so that only text nodes remain. Finally, you run .replaceWith() over that collection.

Note that the pattern definition is placed outside of the replaceWith callback function, as it is effectively a constant while a single click is being handled.
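The same idea can also be sketched without jQuery, using the standard DOM TreeWalker API to visit text nodes at any depth. This is an alternative sketch, not the code from the demo: the function names are made up, root is assumed to be an element, and the DOM part assumes a browser environment. It builds the replacement from text nodes and <b> elements, not from an HTML string, so characters such as < in the text are not re-parsed as markup.

```javascript
// Split `text` into pieces, flagging the pieces that match `pattern`.
// `pattern` must be a RegExp created with the 'g' flag.
function splitByMatches(text, pattern) {
    var pieces = [];
    var last = 0;
    var m;
    pattern.lastIndex = 0;
    while ((m = pattern.exec(text)) !== null) {
        if (m[0] === '') { pattern.lastIndex++; continue; } // skip empty matches
        if (m.index > last) {
            pieces.push({ text: text.slice(last, m.index), match: false });
        }
        pieces.push({ text: m[0], match: true });
        last = m.index + m[0].length;
    }
    if (last < text.length) {
        pieces.push({ text: text.slice(last), match: false });
    }
    return pieces;
}

// Browser only: highlight `keyword` in every text node under `root`.
function highlightTextNodes(root, keyword) {
    var escaped = keyword.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
    if (escaped === '') { return; } // nothing to search for
    var pattern = new RegExp(escaped, 'gi');
    var walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
    var textNodes = [];
    // Collect first: replacing nodes while walking would disturb the walker.
    while (walker.nextNode()) {
        textNodes.push(walker.currentNode);
    }
    textNodes.forEach(function (node) {
        // Skip text that is already inside a highlight.
        if (node.parentNode.closest('.highlight')) { return; }
        var fragment = document.createDocumentFragment();
        splitByMatches(node.data, pattern).forEach(function (piece) {
            var child = document.createTextNode(piece.text);
            if (piece.match) {
                var b = document.createElement('b');
                b.className = 'highlight';
                b.appendChild(child);
                child = b;
            }
            fragment.appendChild(child);
        });
        node.parentNode.replaceChild(fragment, node);
    });
}
```

In a page like the fiddle, this could be called from the click handler as highlightTextNodes(document.getElementById('sample'), searchValue), where searchValue stands for the content of the search field.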


 