Regex help with replacing <html> tags

Question

I need to extend the regex below so that it also selects <code> tags with a class, e.g. <code class="lol">.

var text = 'This is <i>encoded text</i> but this is <b>bold</b >!';
var html = $('<div/>')
    .text(text)
    .html()
    .replace(new RegExp('&lt;(/)?(b|i|u)\\s*&gt;', 'gi'), '<$1$2>');

Can anyone please help?

I'm guessing something like <(/)?(b|i|u|code|pre)?( class="")\\\\s*> ??

Many thanks

Answer 1

Parsing HTML with a regex is a bad idea; see this answer.

The easiest way would be to simply use some of jQuery's DOM manipulation functions to remove the formatting.

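// Assumes `text` holds the markup string from the question.
var $div = $('<div/>').html(text);   // parse the markup into a detached element
$div.find("b, i, code, code.lol").each(function() {
    $(this).replaceWith($(this).text());   // swap each element for its plain text
});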

Code example on jsfiddle.

Answer 2

This will replace the whole tag with everything in it (including class, id, etc.):

.replace(new RegExp('&lt;(/)?(b|u|i|code|pre)(.*?)&gt;', 'gim'), '<$1$2$3>');

Matching a code tag with a class in an encoded string is hard (maybe impossible); it's easy when the code tag is in a fixed format ( <code class="whatever"> ):

.replace(new RegExp('&lt;(?:(code\\sclass=".*?")|(/)?(b|u|i|code|pre)(?:.*?))&gt;', 'gim'), '<$1$2$3>');
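To see what the fixed-format pattern above does without a browser, here is a minimal sketch. It assumes a small hypothetical helper, encodeHtml, standing in for the jQuery .text().html() step (it escapes only &, < and >); the names encodeHtml and restoreTags are illustrative and not from the original posts. The pattern and replacement string are copied unchanged from the answer.

```javascript
// Hypothetical stand-in for $('<div/>').text(text).html():
// escapes only the characters that matter for this example.
function encodeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Pattern and replacement copied from the answer above.
var pattern = new RegExp(
  '&lt;(?:(code\\sclass=".*?")|(/)?(b|u|i|code|pre)(?:.*?))&gt;',
  'gim'
);

// Turn whitelisted, encoded tags back into real tags.
// Unmatched capture groups expand to an empty string in the replacement.
function restoreTags(encoded) {
  return encoded.replace(pattern, '<$1$2$3>');
}

var text = 'This is <i>encoded text</i> but this is <b>bold</b >! <code class="lol">x</code>';
var html = restoreTags(encodeHtml(text));
console.log(html);
```

One thing to watch for: because (?:.*?) follows the tag name directly, an encoded tag whose name merely starts with a whitelisted name also matches. For example, an encoded <img src="x"> is matched as i plus trailing characters and comes out as <i>. Requiring whitespace or the closing &gt; right after the tag name would avoid that.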

Answer 3

I wouldn't use a regex for parsing markup, but if it's just a string snippet, something like this would be sufficient. It should be noted that the regex you're using is overburdened by the \\s*. Because the whitespace is optional, the regex also matches tags that have no whitespace to remove, going through the overhead of replacing them with the exact same thing. Better to use \\s+

regex: <(/?(?:b|i|u)|code\\s[^>]*class\\s*=\\s*(['"]).*?\\2[^>]*?)\\s+>
replace: <$1>
modifiers: sgi

<                       # < Opening markup char
   (                       # Capture group 1
       /?                        # optional element termination
       (?:                       # grouping, non-capture
          b|i|u                    # elements 'b', 'i', or 'u'
       )                         # end grouping
    |                         # OR,
       code                      # element 'code' only
       \s [^>]*                  # followed by a space and possibly any chars except '>'
       class \s* = \s*           # 'class' attribute '=' something
         (['"]) .*? \2           # value delimiter, then some possible chars, then delimiter
       [^>]*?                    # followed by possibly any chars not '>'
   )                       # End capture group 1
   \s+                     # Here need 1 or more whitespace, what is being removed
>                      # > Closing markup char
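JavaScript has no free-spacing mode, so to try the expanded pattern above it has to be collapsed back onto one line. The following is a minimal sketch of that, following the commented form (which uses [^>]* after the first \s). The names trimTagWhitespace, tidyTags and sample are illustrative assumptions, and the s flag requires an engine with ES2018 dotAll support.

```javascript
// The expanded pattern above, collapsed into a JavaScript regex literal.
// Flags: g (global), i (case-insensitive), s (dotAll, ES2018+).
var trimTagWhitespace =
  /<(\/?(?:b|i|u)|code\s[^>]*class\s*=\s*(['"]).*?\2[^>]*?)\s+>/gis;

// Remove whitespace that sits just before the closing '>' of a matched tag.
function tidyTags(markup) {
  // $1 is everything between '<' and the trailing whitespace.
  return markup.replace(trimTagWhitespace, '<$1>');
}

var sample = 'This is <b >bold</b > and <code class="lol" >x</code>';
console.log(tidyTags(sample));
```

Because the pattern ends in \s+ rather than \s*, tags that carry no trailing whitespace, such as <i> or </code>, are not matched at all and pass through unchanged.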

3 answers

Answer 1
3 2011-03-14 18:03:07

Answer 2
1 2011-03-16 16:43:32

Answer 3
0 2011-03-14 18:52:53