Securely serving JSON and HTML to JavaScript

Question

I am thinking of secure ways to serve HTML and JSON to JavaScript. Currently I am just outputting the JSON like:

 ajax.php?type=article&id=15

{
 "name":    "something",
 "content": "some content"
}

but I do realize this is a security risk, because the articles are created by users. Someone could insert script tags (just one example) into the content and link directly to their article through the AJAX API. So I am now wondering what the best way to prevent such issues is. One way would be to encode all non-alphanumeric characters in the input and then decode them in JavaScript (and encode them again when the content is inserted somewhere).

Another option could be to send headers that stop the browser from ever rendering the responses of the AJAX API requests (Content-Type and X-Content-Type-Options).

Answer 1

If you set the Content-Type to application/json, then no browser will execute JavaScript on that page. This is part of RFC 4627, and Google uses this to protect themselves. Other application/* content types follow similar rules.
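As a rough illustration of the headers discussed here, the sketch below builds a JSON response with Content-Type set to application/json and X-Content-Type-Options set to nosniff (the second header comes from the question, not from this answer). The helper name buildJsonResponse and the shape of the returned object are assumptions made for this example, not part of any framework. Escaping "<", ">" and "&" as \uXXXX sequences is an extra precaution added here; it is not something the answer itself requires.

```javascript
// Hypothetical helper: builds status, headers and body for a JSON API reply.
// The name and return shape are assumptions for illustration only.
function buildJsonResponse(data) {
  // JSON.stringify produces valid JSON; the replace step rewrites "<", ">"
  // and "&" as \uXXXX escapes so the body contains no literal markup.
  // JSON.parse on the client restores the original characters.
  const body = JSON.stringify(data).replace(/[<>&]/g, (ch) =>
    "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0")
  );
  return {
    status: 200,
    headers: {
      // Tells the browser the payload is data, not a page to render.
      "Content-Type": "application/json; charset=utf-8",
      // Asks the browser not to guess a different type from the content.
      "X-Content-Type-Options": "nosniff",
    },
    body,
  };
}

const response = buildJsonResponse({
  name: "something",
  content: "<script>alert(/xss/)</script>",
});
console.log(response.headers["Content-Type"]);
console.log(response.body);
```

In a real server the headers and body would be passed to whatever response object the server framework provides.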

You still have to worry about DOM-based XSS; however, that would be a problem with your JavaScript, not really with the content of the JSON. Another, more exotic security concern with JSON is information leakage, like this vulnerability in Gmail.
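To make the DOM-based XSS point concrete, here is a minimal sketch of client code that writes article content as plain text instead of parsing it as markup. The function name renderArticle is hypothetical, and a plain object stands in for a real DOM element so that the snippet also runs outside a browser; in a page, the container would come from something like document.getElementById.

```javascript
// Hypothetical renderer; `container` is assumed to be a DOM element
// (or any object with a textContent property, as in the stand-in below).
function renderArticle(container, article) {
  // textContent stores the value as literal text, so a "<script>" string
  // is displayed rather than interpreted. Assigning the same string to
  // innerHTML would have the browser parse it as markup, which is where
  // DOM-based XSS typically starts.
  container.textContent = article.content;
}

// Stand-in for a DOM element, used only so this example is runnable anywhere.
const fakeElement = { textContent: "" };
renderArticle(fakeElement, { name: "something", content: "<b>some content</b>" });
console.log(fakeElement.textContent);
```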

Make sure to always test your code. There is the free Sitewatch XSS scanner and the open source Skipfish, and finally you can test this manually with a simple <script>alert(/xss/)</script>.

Answer 2

Instead of worrying about how to encode the malicious code when you return it, you should probably make sure that it does not even get into your database. A quick Google search on preventing cross-site scripting and on input validation might help you here. Cheers

Answer 3

If the user has to be logged in to view the web page, then secure ajax.php with the same authorization mechanism. A client that is not logged in then cannot access ajax.php directly to retrieve the data.

Answer 4

I don't think your question is about validating user input, as others have suggested. You don't want to provide your JSON API to other people... right?

If this is the case, then there isn't much you can do... in fact, even if you were serving HTML instead of JSON, people would still scrape the HTML to get what they wanted from your site (this is how search engine spiders work).

A good way to limit scraping is to allow only a certain number of downloads per IP address. That way, if someone requests http://yoursite.com/somejson.json more than 100 times a day, you can be fairly sure it's a scraper and not someone visiting your page 100 times in one day.
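The per-IP limit described above could be sketched roughly as follows. This is an in-memory, fixed-window counter; the names createRateLimiter and isAllowed, the default of 100 requests per day, and the idea of passing the current time as a parameter are all assumptions for illustration. A real deployment would likely need shared storage and cleanup of old entries.

```javascript
const LIMIT_PER_DAY = 100;              // threshold taken from the answer's example
const DAY_MS = 24 * 60 * 60 * 1000;     // length of one counting window

// Hypothetical factory: returns a function that reports whether a request
// from `ip` at time `now` (milliseconds) is still within the limit.
function createRateLimiter(limit = LIMIT_PER_DAY, windowMs = DAY_MS) {
  const counters = new Map(); // ip -> { windowStart, count }

  return function isAllowed(ip, now = Date.now()) {
    const entry = counters.get(ip);
    // First request from this IP, or the previous window has expired:
    // start a fresh window and count this request as the first one.
    if (!entry || now - entry.windowStart >= windowMs) {
      counters.set(ip, { windowStart: now, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}

// Usage with a tiny limit so the effect is visible: 3 requests per second.
const isAllowed = createRateLimiter(3, 1000);
for (let i = 1; i <= 4; i++) {
  console.log(`request ${i}:`, isAllowed("203.0.113.7", 0));
}
```

A fixed window is the simplest variant; it can let through a burst around a window boundary, which may or may not matter for scraper detection.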

Answer 5

Insertion of script tags (or SQL) is only a problem if you fail to neutralize it at the point where it could become a problem.

A <script> tag in the middle of a comment that somebody submits will not hurt your server, and it won't hurt your database. What it would hurt, if you fail to take appropriate measures, is a page that includes the comment when you subsequently serve it up and it reaches a client browser. To prevent that from happening, the code that prepares the page must make sure that user-supplied content is always scrubbed before it is exposed to an unaware interpreter. In this case, that unaware interpreter is a client web browser. In fact, the client web browser really involves two unaware interpreters: the HTML parser and layout engine, and the JavaScript interpreter.

Another important example of an unaware interpreter is your database server. Note that a <script> tag is (almost certainly) harmless to your database, because "<script>" doesn't mean anything in SQL. It's other sorts of input that cause problems for SQL, like quotes in strings (which are harmless to your HTML pages!).

Stack Overflow would be pretty lame if I couldn't put <script> tags in my answers, as I'm doing now. The same goes for examples of SQL injection attacks. Recently somebody linked a page from some prominent US bank, where a big <textarea> was footnoted by a warning not to include the characters "<" or ">" in whatever you typed. Predictably, the bank was ridiculed over hundreds of Reddit comments, and rightly so.

Exactly how you "scrub" user-supplied content depends on the unaware interpreter to which you're delivering it. If it's going to be dropped into the middle of HTML markup, then you have to make sure that the "<", ">", and "&" characters are all encoded as HTML entities. (You might want to encode quote characters too, if the content might end up in an HTML element attribute value.) If the content is to be dropped into JavaScript, however, you may not need to worry about HTML escaping, but you do need to worry about quotes, and possibly about Unicode characters outside the 7-bit range.
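The two contexts described above can be sketched with one small helper each. Both function names (escapeHtml and toJsStringLiteral) are made up for this example, and the exact character sets are one reasonable choice rather than the only correct one. The JavaScript variant also escapes "<", ">" and "&", which goes slightly beyond the paragraph above; that is a cautious assumption for the case where the script sits inside an HTML page.

```javascript
const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// For content placed inside HTML markup or a quoted attribute value.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// For content placed inside JavaScript source as a string literal.
// JSON.stringify adds the surrounding quotes and escapes quotes,
// backslashes and control characters; the replace step then rewrites
// "<", ">", "&" and everything outside printable ASCII as \uXXXX.
function toJsStringLiteral(text) {
  return JSON.stringify(String(text)).replace(
    /[<>&]|[^\x20-\x7e]/g,
    (ch) => "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

console.log(escapeHtml('<b>"Tom" & Jerry</b>'));
console.log(toJsStringLiteral('</script>"é'));
```

The point of having two helpers is that the right encoding is chosen at output time, per destination, rather than once at input time.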

Answer 6

To output safe HTML from PHP, I recommend http://htmlpurifier.org/

Answer scores and dates

Answer 1: score 6, accepted, 2010-07-24 18:37:21
Answer 2: score 4, 2010-07-24 11:40:29
Answer 3: score 1, 2010-07-24 14:30:06
Answer 4: score 0, 2010-07-24 11:51:44
Answer 5: score 0, 2010-07-24 12:54:26
Answer 6: score 0, 2010-07-24 18:47:03
