JavaScript match regexp

I am building a JavaScript application in which I need to know the HTML tags that surround a user selection and, for ease of use, put them in an array.

I used htmlText, which gave me a string that looks something like this:

<h1><span style="color: rgb(102, 51, 153); font-weight: bold; text-decoration: underline;"><sub>test</sub></span></h1>

Since I have hardly any knowledge of regular expressions, and what I do know doesn't seem to do what I want, I was hoping one of you could help me with this part.

So what is the best way to turn the above string into the following array?

<h1>,
<span style="color: rgb(102, 51, 153); font-weight: bold; text-decoration: underline;">,
<sub>

My code so far (I don't know whether I am on the right track, though):

var fullhtml = SEOM_common.range.htmlText;//Get user selection + Surrounding html tags
var tags = fullhtml.split(SEOM_common.selected_value);//Split by user selection
var tags_arr = tags[0].match(/<(.+)>/);//Create array of tags

Thanks, guys, for the answers and comments. I managed to build the following method, which does exactly what I want.

find_all_parents : function(selectRange, endNode){
    var nodes = [];
    var nodes_to_go = [];
    if(selectRange.commonAncestorContainer) nodes_to_go.push(selectRange.commonAncestorContainer.parentNode);//standards-compliant browsers
    else nodes_to_go.push(selectRange.parentElement());//IE<9 browsers

    var node;

    //walk up until the tag named by endNode (lowercase, e.g. "body") or the top of the tree
    while( (node=nodes_to_go.pop()) && !(node.tagName && node.tagName.toLowerCase() === endNode) ){
        if(node.nodeType === 1){ //only element nodes (tags)
            nodes.push(node);
        }

        nodes_to_go.push(node.parentNode);
    }
    return nodes;
}
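
To show what an upward walk like this returns, here is a minimal standalone sketch. It assumes plain objects standing in for DOM nodes (the `mockElement` helper and `mockRange` object are hypothetical, not part of the original code), and it checks that `tagName` exists before lowercasing it so that walking past the root does not throw. In a browser, the range would typically come from something like `window.getSelection().getRangeAt(0)` instead of a mock.

```javascript
// Hypothetical helper: a plain object with the few DOM properties the walk reads.
function mockElement(tagName, parent) {
    return { nodeType: 1, tagName: tagName, parentNode: parent || null };
}

// Mock chain for <body><h1><span><sub>test</sub></span></h1></body>.
var body = mockElement('BODY', null);
var h1 = mockElement('H1', body);
var span = mockElement('SPAN', h1);
var sub = mockElement('SUB', span);
var text = { nodeType: 3, parentNode: sub }; // text node holding "test"

// Stand-in for a DOM Range whose selection lies inside the text node.
var mockRange = { commonAncestorContainer: text };

// Walks from the selection up to (but not including) the element named endNode.
function findAllParents(selectRange, endNode) {
    var nodes = [];
    var node = selectRange.commonAncestorContainer
        ? selectRange.commonAncestorContainer.parentNode
        : selectRange.parentElement(); // old IE TextRange fallback
    while (node && !(node.tagName && node.tagName.toLowerCase() === endNode)) {
        if (node.nodeType === 1) { // element nodes only
            nodes.push(node);
        }
        node = node.parentNode;
    }
    return nodes;
}

// Tag names of the collected parents, innermost first.
var parents = findAllParents(mockRange, 'body').map(function (n) {
    return n.tagName;
});
```

The result is ordered from the innermost tag outward; call `reverse()` on it if the outermost tag should come first, as in the array from the question.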

Don't use regex for this. Use document manipulation methods instead and fetch the tags themselves (rather than the textual representation of the tags).

For example:

var find_all_nodes = function(rootNode){
    var nodes = [];
    var nodes_to_go = [rootNode];
    var node;
    while( (node=nodes_to_go.pop()) ){
        if(node.nodeType === 1){ //only element nodes (tags)
            nodes.push(node); //collect the element itself, not the work stack
        }
        var cs = node.childNodes;
        for(var i=0; i<cs.length; i++){
            nodes_to_go.push(cs[i]);
        }
    }
    return nodes;
}

Once you have a tag, you can get all sorts of information from it. I recommend checking out the DOM docs on MDN and the compatibility notes on Quirksmode.
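
As a rough illustration of reading information from the collected nodes, the sketch below runs a variant of the traversal above over a small mock tree and builds a tag string from each element's `tagName`. The `mockNode` helper and the tree are assumptions made so the snippet runs without a browser; with a real document you would pass an actual element as the root.

```javascript
// Hypothetical helper: a plain object mimicking the DOM properties
// the traversal reads (nodeType, tagName, childNodes).
function mockNode(nodeType, tagName, children) {
    return { nodeType: nodeType, tagName: tagName, childNodes: children || [] };
}

// <h1><span><sub>test</sub></span></h1>, where 1 = element and 3 = text node.
var tree = mockNode(1, 'H1', [
    mockNode(1, 'SPAN', [
        mockNode(1, 'SUB', [mockNode(3, undefined, [])])
    ])
]);

// Depth-first traversal that collects every element node under rootNode.
function findAllNodes(rootNode) {
    var nodes = [];
    var nodesToGo = [rootNode];
    var node;
    while ((node = nodesToGo.pop())) {
        if (node.nodeType === 1) { // element nodes only
            nodes.push(node);
        }
        for (var i = 0; i < node.childNodes.length; i++) {
            nodesToGo.push(node.childNodes[i]);
        }
    }
    return nodes;
}

// Read a property from each collected element: here, its tag name.
var tagNames = findAllNodes(tree).map(function (n) {
    return '<' + n.tagName.toLowerCase() + '>';
});
```

On real elements the same loop could read `attributes`, `style`, or `className` instead of just `tagName`.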

You should not use regex for HTML/XML parsing.

...unless you have a good reason to do so!

If so, then replace (<h1>)(<span[^>]*>)(<sub>)[^<]*</sub></span></h1> with $1,\\n$2\\n$3 .
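
For the array the question asks for, the capture groups of that pattern can also be read directly rather than substituted into a string. This is only a sketch against the sample string from the question. The second, more general pattern is an assumption and not part of the answer above; it will break on attribute values that contain `>`.

```javascript
// Sample string from the question.
var html = '<h1><span style="color: rgb(102, 51, 153); font-weight: bold; text-decoration: underline;"><sub>test</sub></span></h1>';

// The pattern from the answer: one capture group per opening tag.
var pattern = /(<h1>)(<span[^>]*>)(<sub>)[^<]*<\/sub><\/span><\/h1>/;

// exec returns [fullMatch, group1, group2, group3]; slice(1) keeps only the groups.
var m = pattern.exec(html);
var openingTags = m ? m.slice(1) : [];

// Assumed, more general variant: every tag that starts with "<"
// but is not a closing tag ("</") or a comment/doctype ("<!").
var anyOpeningTags = html.match(/<[^\/!][^>]*>/g) || [];
```

Both arrays hold the three opening tags in document order, but the first pattern only works for this exact h1/span/sub nesting.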

