A regex to remove id, style, class attributes from HTML tags in JS

I have an HTML string in JavaScript, and I want to use a regex to remove the id, style and class attributes from its tags. For example, I have:

New York City.<div style="padding:20px" id="upp" class="upper"><div style="background:#F2F2F2; color:black; font-size:90%; padding:10px 10px; width:500px;">This message is.</div></div>

I want this string to become:

New York City.<div><div>This message is.</div></div>

Instead of parsing the HTML using regular expressions, which is a bad idea, you could take advantage of the DOM functionality that is available in all browsers. We need to be able to walk the DOM tree first:

var walk_the_DOM = function walk(node, func) {
    func(node);
    node = node.firstChild;
    while (node) {
        walk(node, func);
        node = node.nextSibling;
    }
};

Now parse the string and manipulate the DOM:

var wrapper = document.createElement('div');
wrapper.innerHTML = '<!-- your HTML here -->';
// Start at the wrapper itself so that every top-level node in the string
// is visited, not only the first one (which may be a plain text node).
walk_the_DOM(wrapper, function(element) {
    if (element.removeAttribute) {
        element.removeAttribute('id');
        element.removeAttribute('style');
        element.removeAttribute('class');
    }
});
var result = wrapper.innerHTML;

See also this JSFiddle.
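The walker visits the node it is given and everything below it, but not that node's siblings, so the choice of starting node matters. The sketch below shows the visiting order using plain objects as stand-in nodes. The fakeNode and visited helpers are made up for illustration and only mimic the three properties the walker reads; they are not real DOM nodes.

```javascript
// Same walker as above, repeated so the sketch is self-contained.
var walk_the_DOM = function walk(node, func) {
    func(node);
    node = node.firstChild;
    while (node) {
        walk(node, func);
        node = node.nextSibling;
    }
};

// Hypothetical stand-in for a DOM node: links children through
// firstChild / nextSibling the way the real DOM does.
function fakeNode(name, children) {
    var node = { nodeName: name, firstChild: null, nextSibling: null };
    children = children || [];
    for (var i = 0; i < children.length; i++) {
        if (i === 0) {
            node.firstChild = children[i];
        } else {
            children[i - 1].nextSibling = children[i];
        }
    }
    return node;
}

// wrapper > [ #text, DIV > [ DIV > [ #text ] ] ], mirroring the example markup
var wrapper = fakeNode('WRAPPER', [
    fakeNode('#text'),
    fakeNode('DIV', [fakeNode('DIV', [fakeNode('#text')])])
]);

// Collect the names of all nodes reached from a given starting node.
function visited(start) {
    var names = [];
    walk_the_DOM(start, function (node) { names.push(node.nodeName); });
    return names.join(' ');
}

var fromWrapper = visited(wrapper);               // "WRAPPER #text DIV DIV #text"
var fromFirstChild = visited(wrapper.firstChild); // "#text"
```

Starting from the wrapper reaches both div elements; starting from its first child reaches only the leading text node.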

If you are willing to remove everything but the div tag names:

string=string.replace(/<(div)[^>]+>/ig,'<$1>');

This will return <DIV> if the HTML is upper case.
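If the upper-case result is unwanted, one possible variation is to rebuild each tag in a replacement callback instead of reusing the captured name. This is only a sketch: the function name stripDivAttributes is made up for illustration, and it assumes that no attribute value contains a > character.

```javascript
// Strip all attributes from <div> tags and normalise the tag name to
// lower case. Closing tags are matched too, so </DIV> becomes </div>.
function stripDivAttributes(html) {
    return html.replace(/<(\/?)div\b[^>]*>/ig, function (match, slash) {
        // slash is "/" for a closing tag and "" for an opening tag
        return '<' + slash + 'div>';
    });
}

var mixedCase = 'New York City.<DIV style="padding:20px" id="upp">' +
    '<div class="x">This message is.</div></DIV>';
var cleaned = stripDivAttributes(mixedCase);
// cleaned === 'New York City.<div><div>This message is.</div></div>'
```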

If you just want to remove the attributes, then regex is the wrong tool. I'd suggest, instead:

function stripAttributes(elem){
    if (!elem) {
        return false;
    }
    else {
        var attrs = elem.attributes;
        while (attrs.length) {
            elem.removeAttribute(attrs[0].name);
        }
    }
}

var div = document.getElementById('test');

stripAttributes(div);

JS Fiddle demo.

I used this:

var html = 'New York City.<div style="padding:20px" id="upp" class="upper"><div style="background:#F2F2F2; color:black; font-size:90%; padding:10px 10px; width:500px;">This message is.</div></div>';

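// Removes every name="value" attribute from every tag in str,
// except those whose names are listed in attrs (the attributes to keep).
function clear_attr(str, attrs) {
    var attrRe = /\s*(\w+)="[^"]+"/g;   // one double-quoted attribute
    var tagRe = /<\s*(\w+).*?>/g;       // one opening tag
    return str.replace(tagRe, function (tag) {
        // The second callback argument is the captured attribute name.
        return tag.replace(attrRe, function (attr, name) {
            return attrs.indexOf(name) >= 0 ? attr : '';
        });
    });
}
// An empty keep-list removes all attributes.
var cleaned = clear_attr(html, []);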
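For the three attributes in the question specifically, a deny-list variant of the same idea might look like the sketch below. The name removeNamedAttributes and its parameters are assumptions for illustration. Like any regex approach, it assumes reasonably well-formed markup with no > inside attribute values.

```javascript
// Remove only the attributes listed in names, leaving all others in place.
// Handles double-quoted, single-quoted and unquoted attribute values.
function removeNamedAttributes(html, names) {
    var attrRe = /\s+([\w-]+)\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)/g;
    // <\w... matches opening tags only; closing tags start with "</".
    return html.replace(/<\w[^>]*>/g, function (tag) {
        return tag.replace(attrRe, function (attr, name) {
            return names.indexOf(name.toLowerCase()) >= 0 ? '' : attr;
        });
    });
}

var source = 'New York City.<div style="padding:20px" id="upp" class="upper">' +
    '<div style="background:#F2F2F2; color:black; font-size:90%; ' +
    'padding:10px 10px; width:500px;">This message is.</div></div>';

var stripped = removeNamedAttributes(source, ['id', 'style', 'class']);
// stripped === 'New York City.<div><div>This message is.</div></div>'
```

Attributes that are not on the list, such as title or href, pass through unchanged.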

Use a regular expression. It is fast (in production) and easy (in development). Note that the following strips every attribute from every tag, not just id, style and class:

htmlCode = htmlCode.replace(/<([^ >]+)[^>]*>/ig,'<$1>');

Trying to parse HTML with regexes will cause problems. This answer may be helpful in explaining them. If you are using jQuery, you may be able to do something like this:

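var transformedHtml = $("<div>").html(html).find("*").removeAttr("id").removeAttr("style").removeAttr("class").end().html();

Here the string is wrapped in a temporary div so that find("*") also reaches the top-level elements, and end().html() returns the cleaned markup of the wrapper, so no outerHTML plugin is needed.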

If you don't want to use jQuery, it will be trickier. These questions may have some helpful answers on how to convert the string to a collection of DOM elements: "Converting HTML string into DOM elements?" and "Creating a new DOM element from an HTML string using built-in DOM methods or Prototype". You may be able to loop through the elements and remove the attributes using the built-in removeAttribute function. I don't have the time or motivation to figure out all the details for you.

A plain script solution would be something like:

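function removeProperties(markup) {
  var div = document.createElement('div');
  div.innerHTML = markup;
  var el, els = div.getElementsByTagName('*');

  for (var i=0, iLen=els.length; i<iLen; i++) {
    el = els[i];
    // removeAttribute drops the attribute entirely; assigning ''
    // would leave empty id="", style="" and class="" behind.
    el.removeAttribute('id');
    el.removeAttribute('style');
    el.removeAttribute('class');
  }
  // Return the cleaned markup. To add the elements to the DOM instead,
  // move them one at a time:
  //   while (div.firstChild) someElement.appendChild(div.firstChild);
  return div.innerHTML;
}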

A more general solution would take the property names as extra arguments, or, say, as a space-separated string, then iterate over the names to remove them.
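A rough sketch of that generalisation follows. The function name removeAttributesFrom and its parameters are hypothetical; it accepts either an array of names or a space-separated string, and works on anything array-like whose items have a removeAttribute method.

```javascript
// Remove each named attribute from each element in the collection.
// names may be an array (['id', 'style']) or a string ('id style').
function removeAttributesFrom(elements, names) {
    var list = typeof names === 'string'
        ? names.split(/\s+/).filter(Boolean) // ignore stray whitespace
        : names;
    for (var i = 0; i < elements.length; i++) {
        for (var j = 0; j < list.length; j++) {
            elements[i].removeAttribute(list[j]);
        }
    }
}
```

In a browser this could be called as removeAttributesFrom(div.getElementsByTagName('*'), 'id style class') on a temporary div whose innerHTML has been set to the markup.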

I don't know about RegEx, but I sure as hell know about jQuery.

Convert the given HTML string into a DOM element, parse it, and return its contents.

function cleanStyles(html){
    var temp = $(document.createElement('div'));
    temp.html(html);

    temp.find('*').removeAttr('style');
    return temp.html();
}
