
How to remove a specific HTML attribute from all tags using regular expressions in JavaScript?

I have a very large HTML document that would take too long to parse into a DOM tree, so that option, despite being "proper", is not available. I need to remove all the inline style declarations inside tags.

There is a regular expression that seems to work in most cases:

> re
/\sstyle\s*=(\"[^\">]*\"*|\'[^\'>]*\'*|[^\s>]*)/gi
> test
[ '<img src="some.jpg" style="width:auto" width="50" height="60">',
  '<img style=\'width:auto\'>',
  '<img style=\'width:auto>',
  '<img style=width:auto>',
  '<div style=\'\'>',
  '<div style=\'background-image:url(\'paper.gif\');\'',
  '<div style=\'background-image:url(\\\'paper.gif\\\');\'' ]
> test.forEach(function(t){console.log(t.replace(re,''))})
<img src="some.jpg" width="50" height="60">
<img>
<img>
<img>
<div>
<divpaper.gif');'
<divpaper.gif\');'

As you can see, when there are repeated quotes inside the value part, with or without proper escaping, the regular expression doesn't work. Any ideas on how I can improve it?

The standard way of finding an attribute would be something like / style="[^"]+"/g.
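
To make the limits of that pattern concrete, here is a minimal sketch. The helper name removeDoubleQuotedStyle is made up for illustration. It shows that the pattern only covers non-empty, double-quoted values preceded by a single space:

```javascript
// The pattern quoted above: a literal space, then style="...", where the
// value is at least one character long and contains no double quote.
const SIMPLE_STYLE_RE = / style="[^"]+"/g;

// Hypothetical helper name, used only for this illustration.
function removeDoubleQuotedStyle(html) {
  return html.replace(SIMPLE_STYLE_RE, '');
}

// Well-formed, double-quoted value: the attribute is removed.
console.log(removeDoubleQuotedStyle('<img src="some.jpg" style="width:auto" width="50">'));

// Single-quoted and empty values do not match, so they are left in place.
console.log(removeDoubleQuotedStyle("<img style='width:auto'>"));
console.log(removeDoubleQuotedStyle('<div style="">'));
```

So it is fine when you control the markup and know it is consistently double-quoted, and not enough for the mixed input in the question.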

The problem with your markup is that it's all over the place. Regular expressions are awesome at finding patterns, but there are no predictable patterns in this markup.

Why would you want to write one big regular expression to do all of that at once?

Parsing it into a DOM tree might take too much time, but a hand-crafted parser will probably serve you better than a single regular expression.

You can also mix the two: use a regular expression to isolate each and every tag (which is easy), then parse the attributes inside the tag, isolating (and removing) any style attribute you encounter.
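
The mixed approach could be sketched roughly like this. The names TAG_RE, ATTR_RE and stripStyleAttributes are made up for illustration, and the sketch assumes attribute quoting is well formed. A tag with an unterminated quote is left untouched, values containing unescaped quotes of the same kind (as in the last two test strings of the question) are not repaired, and tag-like text inside comments or script blocks is treated like any other tag.

```javascript
// Step 1: isolate each opening tag. Quoted attribute values are consumed
// as opaque chunks, so a ">" inside quotes does not end the tag early.
const TAG_RE = /<[a-zA-Z][^\s\/>]*(?:"[^"]*"|'[^']*'|[^'">])*>/g;

// Step 2: inside one tag, match one attribute at a time: leading whitespace,
// a name, and optionally "=" plus a double-quoted, single-quoted,
// or unquoted value.
const ATTR_RE = /\s+([^\s"'<>\/=]+)(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'<>]+))?/g;

// Hypothetical helper: drops every style attribute, keeps everything else.
function stripStyleAttributes(html) {
  return html.replace(TAG_RE, (tag) =>
    tag.replace(ATTR_RE, (attr, name) =>
      // Compare case-insensitively so STYLE= and Style= are caught too.
      name.toLowerCase() === 'style' ? '' : attr
    )
  );
}

console.log(stripStyleAttributes(
  '<img src="some.jpg" style="width:auto" width="50" height="60">'
));
console.log(stripStyleAttributes("<img style='width:auto'>"));
console.log(stripStyleAttributes('<img style=width:auto>'));

// Text outside tags, and "style=" inside another attribute's value,
// are not touched because attributes are consumed whole, left to right.
console.log(stripStyleAttributes('<p>use style="x" here</p>'));
console.log(stripStyleAttributes('<a title="x style=\'y\'" href="z">'));
```

Because each alternative in both patterns starts with a different character class, the matching stays close to a single linear pass over the input, which matters when the document is too large to build a DOM for.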
