
Sanitizing HTML code

I have a webpage that reads data from a database and then displays it, but I've just found an issue with data that could contain HTML in it.

In my database I have the following 3 entries:

Bob

Jill

<button onClick="alert('hi')">Click me too!</button>

Now I have my HTML page that gets the data and displays it, with a click event on each entry. An example would be:

<div onClick="DoSomething()">
   <a>Bob</a>
</div>

My code escapes HTML special characters, so < becomes &lt;
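A minimal sketch of that kind of escaping, assuming a helper called escapeHtml (a made-up name for illustration, not the actual code from the page). It also escapes both quote characters, which matters once the value is placed inside an attribute:

```javascript
// Hypothetical helper: replace the five characters that are special in
// HTML text and attribute values with their character references.
// The ampersand is handled first so the later replacements are not double-escaped.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Example with plain names and a name containing markup.
console.log(escapeHtml('Bob'));
console.log(escapeHtml('<b>Jill</b>'));
```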

That works fine until I get to the last entry and I end up with:

<div onClick="DoSomething()">
   <a>&lt;button onClick="alert('hi')"&gt;Click me too!&lt;/button&gt;</a>
</div>

It displays as expected, so I see:

<button onClick="alert('hi')">Click me too!</button>

but the page also picks up a click event that tries to show "hi".

Does anyone know how I can safely stop the onClick that is defined in the name from running, while still keeping my own onClick event on the surrounding div?

I can't restrict the names that can be added to the database.

Figured out what was happening. When I create the HTML elements that hold the item names, I was creating a custom attribute on each one called DispName and putting the name of the item in it, e.g.:

<div onclick="DoSomething()" DispName="&lt;button onClick="alert('hi')"&gt;Click me too!&lt;/button&gt;">
    <div>&lt;button onClick="alert('hi')"&gt;Click me too!&lt;/button&gt;</div>
</div>

So when my onclick event was called, I would use .getAttribute("DispName") on the element that was clicked and assumed that I would get the value &lt;button onClick="alert('hi')"&gt;Click me too!&lt;/button&gt; back, but what I actually got back was the unescaped text <button onClick="alert('hi')">Click me too!</button>.

I think what is happening is that as soon as I add my HTML elements to the DOM, the escaped value of the DispName attribute is decoded back to the raw text. The escaped text that is not in an attribute is left as I expected.

Not sure if anyone can confirm this, but now I know why the rogue script runs when I try to use the value in DispName.
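One possible way to handle this, sketched under the assumption that the click handler writes the DispName value back into the page as markup: treat whatever getAttribute returns as raw text and escape it again before it becomes part of an HTML string. The names escapeHtml and buildLabelHtml are hypothetical, and the sample value simply stands in for what getAttribute("DispName") returned above.

```javascript
// Hypothetical helper: escape the characters that are special in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Stand-in for element.getAttribute("DispName"): the decoded, raw name.
const dispName = `<button onClick="alert('hi')">Click me too!</button>`;

// Hypothetical builder: escape the raw name again before it goes into markup,
// so the result is inert text rather than a live <button>.
function buildLabelHtml(name) {
  return '<a>' + escapeHtml(name) + '</a>';
}

console.log(buildLabelHtml(dispName));
```

In the browser, assigning the value through element.textContent instead of innerHTML has the same effect without any manual escaping, because the text is never parsed as markup.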
