
JavaScript Regex to strip HTML tags with no attributes

I have an HTML string like <font>New</font><font face="Akronim---Regular" color="#00ff00">Text</font>. Using a JavaScript regex, I want to remove the font tags that have no attributes.

So the output for the above HTML string should be New<font face="Akronim---Regular" color="#00ff00">Text</font>

Below is the code that strips all the font tags, but I only need to strip the font tags with no attributes.

var replace = new RegExp('<'+'font'+'[^><]*>|<.'+'font'+'[^><]*>','g');
var text = '<font>New</font><font face="Akronim---Regular" color="#00ff00">Text</font>';
console.log(text.replace(replace, ''));

Thanks in advance.

Have a look at this. Do not use RegEx to manipulate HTML.

The container could be created in memory if needed.

On Node you can use https://www.npmjs.com/package/jsdom

document.querySelectorAll("font").forEach(f => {
  const parent = f.parentNode;
  if (f.attributes.length === 0) {
    if (f.innerHTML === "") {
      // remove empty fonts - we could do this before too
      parent.removeChild(f);
    } else {
      // move the children out in front of the font, then drop the now-empty font;
      // moving (rather than cloning) keeps nested attribute-less fonts in the document,
      // so they are still unwrapped when the loop reaches them
      while (f.firstChild) parent.insertBefore(f.firstChild, f);
      parent.removeChild(f);
    }
  }
});
console.log(document.getElementById("container").innerHTML);

<div id="container">
  <font>Hello <font face="Akronim---Regular">world</font> </font>
  <font></font>
  <font>New</font>
  <font face="Akronim---Regular" color="#00ff00">Text</font>
</div>

If you want to do it with a regex, you'll have to go tag by tag while keeping track of nesting levels, so that you know when to remove a closing tag and when not to.

To do so, use an array as a stack and push/pop the type of each tag you encounter. When you encounter an opening tag, push true if it has attributes and false if it doesn't, then remove the tag if it has no attributes. When you encounter a closing tag, pop the type of the most recently opened tag: if it had attributes (true), keep the closing tag and move on to the next one; if it didn't (false), remove the closing tag.
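To see why that bookkeeping is needed, here is a small sketch of what happens without it. The function name naiveStrip is hypothetical and only serves the illustration: it pairs each attribute-less <font> with the nearest following </font>, which is the wrong partner as soon as tags are nested.

```javascript
// Naive attempt (for illustration only): pair every attribute-less <font>
// with the first </font> that follows it and keep just the text in between.
function naiveStrip(htmlString) {
  return htmlString.replace(/<font>(.*?)<\/font>/g, '$1');
}

const nested = '<font>Hello <font color="blue">back</font>!</font>';
const naiveResult = naiveStrip(nested);

// The outer <font> was paired with the inner tag's </font>, so the closing
// tag that survives is in the wrong place.
console.log(naiveResult);
// prints: Hello <font color="blue">back!</font>
// wanted: Hello <font color="blue">back</font>!
```

The closing tags all look identical, so only the order in which the tags were opened can tell which one belongs to an attribute-less font.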

The regex should go over both the opening and the closing tags in a single pass, while telling us whether each match is a closing or an opening tag and whether it has attributes. A pattern such as <\/?font[^>]*?> matches both kinds of tag. We put the slash and the attribute part into groups, (\/) and ([^>]*?), because whether or not those groups match tells us if the tag is a closing tag and if it has attributes, respectively (for example, if the / group matched, it is a closing tag). We add \s* to absorb optional whitespace, and the resulting regex is /<(\/)?\s*font\s*([^>]*?)\s*>/g . The attribute group uses [^>] so that a tag with several space-separated attributes, like the one in the question, is matched as a whole.

Here is the function that does the job:

function stripEmptyFonts(htmlString) {
   var tagTypes = [];                                      // stack: true for an open font tag with attributes, false for one without

   return htmlString.replace(/<(\/)?\s*font\s*([^>]*?)\s*>/g, function(match, closingSlash, attributes) {
      if(!closingSlash) {                                  // opening tag (no '/' was matched)
         tagTypes.push(!!attributes);                      // attributes is a string, empty when the tag has none; the double negation !! converts it to a boolean
         return attributes ? match : "";                   // remove the tag if it has no attributes, otherwise keep it as is (see the docs of String#replace)
      } else {                                             // closing tag (a '/' was matched)
         return tagTypes.pop() ? match : "";               // keep this closing tag if its opening tag had attributes (pop returned true), otherwise remove it
      }
   });
}

Example, using the function above:

var html = `
<font>New</font>
<font color="red">Hello <font>world</font>!</font>
<font>Hello <font color="blue">back</font>!</font>
<font>ABCD<font>EFGH<font color="black">IJKL<font>MNOP<font color="red">QRST</font>UVWX</font>YZ</font>1234</font>5678</font>`;

console.log(stripEmptyFonts(html));
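The question builds its pattern from a tag name with new RegExp, so the same stack idea can be sketched for an arbitrary tag. The helper name stripBareTags and its tagName parameter are hypothetical and not part of the answer above. The sketch captures attributes with [^>]*, so it assumes attribute values never contain a literal > character, and it assumes the markup is balanced.

```javascript
// Hypothetical helper: stack-based removal of attribute-less tags for any tag name.
function stripBareTags(htmlString, tagName) {
  // Escape regex metacharacters in case the tag name contains any.
  const safeName = tagName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

  // Group 1: the optional closing slash. Group 2: the attribute text (may be empty).
  // The lookahead stops the pattern from matching longer names such as <fonts>.
  const tagPattern = new RegExp(
    '<(\\/)?\\s*' + safeName + '(?=[\\s>])\\s*([^>]*?)\\s*>',
    'gi'
  );

  // One entry per currently open tag: true if it had attributes.
  const hasAttributesStack = [];

  return htmlString.replace(tagPattern, (match, closingSlash, attributes) => {
    if (!closingSlash) {
      hasAttributesStack.push(attributes !== '');
      return attributes ? match : '';
    }
    // Closing tag: keep it only if the tag it closes was kept.
    return hasAttributesStack.pop() ? match : '';
  });
}

const questionInput =
  '<font>New</font><font face="Akronim---Regular" color="#00ff00">Text</font>';
console.log(stripBareTags(questionInput, 'font'));
// prints: New<font face="Akronim---Regular" color="#00ff00">Text</font>
```

Passing the tag name as a parameter keeps the pattern in one place; for anything beyond simple, well-formed fragments a DOM-based approach remains the safer choice.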

If you have no access to a DOM, you can try this approach:

<script>
let htmlString = '<font>Hello </font><font>New <font face="Akronim-Regular">world</font></font>';
let expoString = htmlString.split('<font>');
expoString = expoString.filter(function(el) {
  return el != null && el != "";
});
for (let i = 0; i < expoString.length; i++) {
  let startTag = expoString[i].split('<font').length - 1;
  let endTag = expoString[i].split('</font>').length - 1;
  for (let j = 1; j <= endTag - startTag; j++) {
    expoString[i] = expoString[i].replace('</font>', '');
  }
}
console.log(expoString.join('')); // return the joined string here instead of logging it if needed
</script>
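As the last comment suggests, the same steps can be wrapped in a function that returns the string. This is only a sketch and the name stripBareFontsBySplit is hypothetical. It keeps the snippet's behaviour of dropping the earliest surplus </font> in each chunk, so it should be treated as reliable only when nothing follows an attributed font's closing tag inside an attribute-less font.

```javascript
// Hypothetical wrapper around the split-based snippet: same steps, but returns the result.
function stripBareFontsBySplit(htmlString) {
  // Splitting on '<font>' drops every attribute-less opening tag.
  const chunks = htmlString.split('<font>').filter(chunk => chunk !== '');

  return chunks
    .map(chunk => {
      // Count the remaining (attributed) opening tags and all closing tags in this chunk.
      const openCount = chunk.split('<font').length - 1;
      const closeCount = chunk.split('</font>').length - 1;

      // Every surplus closing tag is assumed to belong to a removed opening tag.
      let result = chunk;
      for (let j = 0; j < closeCount - openCount; j++) {
        result = result.replace('</font>', ''); // removes the first occurrence
      }
      return result;
    })
    .join('');
}

const sample = '<font>Hello </font><font>New <font face="Akronim-Regular">world</font></font>';
console.log(stripBareFontsBySplit(sample));
// prints: Hello New <font face="Akronim-Regular">world</font>
```

An input such as '<font>New <font color="red">a</font> b</font>' comes out as 'New <font color="red">a b</font>', with the surviving closing tag in the wrong place, which is the case the stack-based version handles.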
