How do I escape some HTML in JavaScript?

Given the text

<b>This is some text</b>

I want to write it to my page so that it shows up like this:

<b>This is some text</b>

and not like this:

This is some text

Using escape("<b>This is some text</b>") gives me this lovely gem in Firefox:

%3Cb%3EThis%20is%20some%20text%3C/b%3E

Not exactly what I'm after. Any ideas?

This should work for you: http://blog.nickburwell.com/2011/02/escape-html-tags-in-javascript.html

function escapeHTML( string )
{
    var pre = document.createElement('pre');
    var text = document.createTextNode( string );
    pre.appendChild(text);
    return pre.innerHTML;
}

Security Warning

The function doesn't escape single and double quotes, which, if used in the wrong context, may still lead to XSS. For example:

var userWebsite = '" onmouseover="alert(\'gotcha\')" "';
var profileLink = '<a href="' + escapeHTML(userWebsite) + '">Bob</a>';
var div = document.getElementById('target');
div.innerHTML = profileLink;
// <a href="" onmouseover="alert('gotcha')" "">Bob</a>

Thanks to buffer for pointing out this case. Snippet taken from this blog post.
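One way to close that gap, sketched here on the assumption that the value ends up inside a quoted attribute, is to escape both kinds of quotes as well. The name escapeAttr is hypothetical and does not come from the blog post.

```javascript
// Hypothetical helper (not from the linked blog post): escapes the five
// characters that matter in HTML text and in quoted attribute values.
function escapeAttr(s) {
    return String(s)
        .replace(/&/g, '&amp;')   // first, so the entities added below are not re-escaped
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

var userWebsite = '" onmouseover="alert(\'gotcha\')" "';
var profileLink = '<a href="' + escapeAttr(userWebsite) + '">Bob</a>';
console.log(profileLink);
// <a href="&quot; onmouseover=&quot;alert(&#39;gotcha&#39;)&quot; &quot;">Bob</a>
```

The payload now stays inside the href value instead of breaking out of it. Escaping quotes does not validate the URL itself, though: a value starting with javascript: contains none of these characters and would need a separate check.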

I ended up doing this:

function escapeHTML(s) { 
    return s.replace(/&/g, '&amp;')
            .replace(/"/g, '&quot;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;');
}
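Note that the order of the replace calls matters: & has to be handled first, otherwise the ampersands introduced by the other entities get escaped a second time. A small sketch of the difference (escapeWrongOrder is a made-up name used only for comparison):

```javascript
function escapeHTML(s) {
    return s.replace(/&/g, '&amp;')
            .replace(/"/g, '&quot;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;');
}

// Hypothetical variant for comparison only: handles '&' last.
function escapeWrongOrder(s) {
    return s.replace(/</g, '&lt;')
            .replace(/>/g, '&gt;')
            .replace(/&/g, '&amp;');
}

console.log(escapeHTML('<b>'));        // &lt;b&gt;
console.log(escapeWrongOrder('<b>'));  // &amp;lt;b&amp;gt;
```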

I like @limc's answer for situations where the HTML DOM document is available.

I like @Michele Bosi's and @Paolo's answers for environments without an HTML DOM document, such as Node.js.

@Michele Bosi's answer can be optimized by replacing the four calls to replace with a single call combined with a clever replacer function:

function escape(s) {
    let lookup = {
        '&': "&amp;",
        '"': "&quot;",
        '\'': "&apos;",
        '<': "&lt;",
        '>': "&gt;"
    };
    return s.replace( /[&"'<>]/g, c => lookup[c] );
}

console.log(escape("<b>This is 'some' text.</b>"));

@Paolo's range test can be optimized with a well-chosen regex, and the for loop can be eliminated by using a replacer function:

function escape(s) {
    return s.replace( /[^0-9A-Za-z ]/g, c => "&#" + c.charCodeAt(0) + ";" );
}

console.log(escape("<b>This is 'some' text</b>"));

As @Paolo indicated, this strategy will work for more scenarios.

Traditional Escaping

If you're using XHTML, you'll need to use a CDATA section. You can use these in HTML, too, but HTML isn't as strict.

I split up the string constants so that this code will work inline on XHTML within CDATA blocks. If you are sourcing your JavaScript as separate files, then you don't need to bother with that. Note that if you are using XHTML with inline JavaScript, then you need to enclose your code in a CDATA block, or some of this will not work. You will run into odd, subtle errors.

function htmlentities(text) {
    var escaped = text.replace(/\]\]>/g, ']]' + '>]]&gt;<' + '![CDATA[');
    return '<' + '![CDATA[' + escaped + ']]' + '>';
}
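To make the replace step easier to follow: a CDATA section cannot contain the sequence ]]>, so the function closes the current section at that point, emits ]]&gt; as ordinary escaped text, and opens a new section. A short sketch of the output for two sample inputs (the inputs are made up for illustration):

```javascript
function htmlentities(text) {
    var escaped = text.replace(/\]\]>/g, ']]' + '>]]&gt;<' + '![CDATA[');
    return '<' + '![CDATA[' + escaped + ']]' + '>';
}

// No "]]>" in the input: the text is simply wrapped.
console.log(htmlentities('<b>bold</b>'));
// <![CDATA[<b>bold</b>]]>

// "]]>" in the input: the section is split around it.
console.log(htmlentities('a]]>b'));
// <![CDATA[a]]>]]&gt;<![CDATA[b]]>
```

This only has the intended effect when the page is parsed as XML. The HTML parser treats a CDATA section outside SVG or MathML content as a comment, so the text would not be displayed there.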

DOM Text Node

The "proper" way to escape text is to use the DOM function document.createTextNode. This doesn't actually escape the text; it just tells the browser to create a text node, which is inherently unparsed. You have to be willing to use the DOM for this method to work, however: that is, you have to use methods such as appendChild, as opposed to the innerHTML property and similar. This would fill an element with ID an-element with text, which would not be parsed as (X)HTML:

var textNode = document.createTextNode("<strong>This won't be bold.  The tags " +
    "will be visible.</strong>");
document.getElementById('an-element').appendChild(textNode);

jQuery DOM Wrapper

jQuery provides a handy wrapper for createTextNode named text. It's quite convenient. Here's the same functionality using jQuery:

$('#an-element').text("<strong>This won't be bold.  The tags will be " +
    "visible.</strong>");

Try this htmlentities for JavaScript:

function htmlEntities(str) {
    return String(str).replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;').replace(/"/g, '&quot;');
}

You can encode all characters in your string:

function encode(e){return e.replace(/[^]/g,function(e){return"&#"+e.charCodeAt(0)+";"})}

Or just target the main characters to worry about (&, line breaks, <, >, " and ') like:

function encode(r){
    return r.replace(/[\x26\x0A\<>'"]/g, function(r){ return "&#" + r.charCodeAt(0) + ";" })
}

test.value = encode('Encode HTML entities!\n\n"Safe" escape <script id=\'\'> & useful in <pre> tags!');
testing.innerHTML = test.value;

/*************
* \x26 is &ampersand (it has to be first),
* \x0A is newline,
*************/

<textarea id=test rows="9" cols="55"></textarea>
<div id="testing">www.WHAK.com</div>

Here's a function that replaces angle brackets with their HTML entities. You might want to expand it to include other characters too.

function htmlEntities( html ) {
    html = html.replace( /[<>]/g, function( match ) {
        if( match === '<' ) return '&lt;';
        else return '&gt;';
    });
    return html;
}

console.log( htmlEntities( '<b>replaced</b>' ) ); // &lt;b&gt;replaced&lt;/b&gt;

I use the following function, which escapes every character with the &#nnn; notation except a-z, A-Z, 0-9 and space:

function Escape( s )
{
    var h,
        i,
        n,
        c;

    n = s.length;
    h = '';

    for( i = 0; i < n; i++ )
    {
        c = s.charCodeAt( i );
        if( ( c >= 48 && c <= 57 ) 
          ||( c >= 65 && c <= 90 ) 
          ||( c >= 97 && c <=122 )
          ||( c == 32 ) )
        {
            h += String.fromCharCode( c );
        }
        else
        {
            h += '&#' + c + ';';
        }
    }

    return h;
}

Example:

Escape('<b>This is some text</b>')

returns

&#60;b&#62;This is some text&#60;&#47;b&#62;

The function is proof against code injection attacks, Unicode-proof, and pure JavaScript.
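One caveat worth checking if the input can contain characters outside the Basic Multilingual Plane (emoji, for instance): charCodeAt returns UTF-16 code units, so such a character comes out as two surrogate references rather than one. A sketch of a code-point-based variant, assuming an ES2015 or later engine (escapeByCodePoint is a hypothetical name, not part of the answer above):

```javascript
// Hypothetical variant of Escape: walks the string by Unicode code point
// instead of by UTF-16 code unit. Requires ES2015 (for...of, codePointAt).
function escapeByCodePoint(s) {
    var h = '';
    for (var ch of s) {                       // for...of yields whole code points
        if (/^[0-9A-Za-z ]$/.test(ch)) {
            h += ch;                          // letters, digits and space pass through
        } else {
            h += '&#' + ch.codePointAt(0) + ';';
        }
    }
    return h;
}

// '\u{1F600}' is a single emoji stored as two UTF-16 code units.
console.log(escapeByCodePoint('<b>\u{1F600}</b>'));
// &#60;b&#62;&#128512;&#60;&#47;b&#62;
```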

This approach is about 50 times slower than the one that creates the DOM text node, but the function still escapes a one-million-character (1,000,000) string in 100-150 milliseconds.

(Tested on an early 2011 MacBook Pro - Safari 9 - Mavericks)
