Remove JavaScript From a URL

I'm writing a server-side script that replaces every URL in a body of text with an <a> tag version, so that the URLs can be clicked.

How can I make sure that the URLs I convert do not contain any XSS-style JavaScript?

I'm currently filtering for "javascript:" in the string, but I suspect that is not sufficient.

Most modern server-side languages have an implementation of Markdown or of another lightweight markup language. These markup languages replace URLs with clickable links.

Unless you have a lot of time to research this topic and implement the script yourself, I'd suggest finding the best Markdown implementation for your language and either studying its code or simply using it in yours.

Markdown usually ships as a library. Some implementations let you configure what they process and what they ignore; in your case you want to process URLs and ignore every other element.
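To make the "process URLs, ignore everything else" idea concrete, here is a minimal sketch in Python using only the standard library. It is not how any particular Markdown library does it; the name `linkify` and the URL pattern are assumptions chosen for illustration. Only text that looks like an http:// or https:// URL becomes a link, and everything else is HTML-escaped.

```python
import html
import re

# Assumed, deliberately simple pattern: http(s) URLs up to the next
# whitespace, angle bracket, or quote character.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)

def linkify(text):
    """Escape text for HTML and wrap http(s) URLs in <a> tags."""
    parts = []
    last = 0
    for match in URL_RE.finditer(text):
        # Escape the plain text between the previous URL and this one.
        parts.append(html.escape(text[last:match.start()]))
        # Escape the URL itself before placing it in the attribute and body.
        url = html.escape(match.group(0), quote=True)
        parts.append('<a href="{0}">{0}</a>'.format(url))
        last = match.end()
    # Escape whatever follows the last URL.
    parts.append(html.escape(text[last:]))
    return "".join(parts)

print(linkify("see http://example.com now"))
```

Because only http(s) matches are ever turned into links, a string such as `javascript:alert(1)` stays plain text. A known limitation of this simple pattern is that trailing punctuation (for example a full stop right after the URL) ends up inside the link.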

Here's an (incomplete) list of solid Markdown implementations for different languages:

You need to attribute-encode the URLs.
You should also make sure that they start with http:// or https://.
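Both checks can be sketched in a few lines of Python. This is an illustration under assumptions, not a complete sanitizer: `to_anchor` is a hypothetical helper name, and rejecting URLs that contain whitespace or control characters is an extra precaution added here, not part of the advice above.

```python
import html
from typing import Optional

ALLOWED_PREFIXES = ("http://", "https://")

def to_anchor(url: str) -> Optional[str]:
    """Return an <a> tag for url, or None if it is not a plain http(s) URL."""
    # Allowlist: anything that does not start with http:// or https:// is refused.
    if not url.lower().startswith(ALLOWED_PREFIXES):
        return None
    # Extra precaution (assumption): refuse whitespace and control characters.
    if any(ord(ch) <= 0x20 or ord(ch) == 0x7F for ch in url):
        return None
    # Attribute-encode: &, <, >, " and ' are turned into character references.
    escaped = html.escape(url, quote=True)
    return '<a href="{0}">{0}</a>'.format(escaped)

print(to_anchor("http://example.com/?a=1&b=2"))
print(to_anchor("javascript:alert(1)"))
```

Checking for an allowed prefix rather than searching for "javascript:" means that unknown or obfuscated schemes are rejected by default, and the encoding step keeps a stray quote in the URL from closing the href attribute early.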

This was taken from the Kohana framework's XSS filtering code. It is not a complete answer, but it might get you on the way.

// Remove javascript: and vbscript: protocols
$str = preg_replace('#([a-z]*)[\x00-\x20]*=[\x00-\x20]*([`\'"]*)[\x00-\x20]*j[\x00-\x20]*a[\x00-\x20]*v[\x00-\x20]*a[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2nojavascript...', $str);
$str = preg_replace('#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*v[\x00-\x20]*b[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2novbscript...', $str);
$str = preg_replace('#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*-moz-binding[\x00-\x20]*:#u', '$1=$2nomozbinding...', $str);

// Only works in IE: <span style="width: expression(alert('Ping!'));"></span>
$str = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?expression[\x00-\x20]*\([^>]*+>#is', '$1>', $str);
$str = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?behaviour[\x00-\x20]*\([^>]*+>#is', '$1>', $str);
$str = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:*[^>]*+>#ius', '$1>', $str);
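The patterns above allow characters in the range \x00-\x20 between the letters of "javascript", which is exactly the kind of input a plain substring filter for "javascript:" misses. As a rough illustration in Python (not a port of the Kohana code; `scheme_of` and `is_allowed` are hypothetical names), the same characters can be removed first and the remaining scheme compared against an allowlist:

```python
import re

# Same character range the quoted patterns tolerate between letters.
IGNORABLE = re.compile(r"[\x00-\x20]+")
# A scheme is a letter followed by letters, digits, "+", "." or "-", then ":".
SCHEME = re.compile(r"^([a-z][a-z0-9+.\-]*):", re.IGNORECASE)

def scheme_of(url):
    """Return the lower-cased scheme after dropping \\x00-\\x20, or None."""
    cleaned = IGNORABLE.sub("", url)
    match = SCHEME.match(cleaned)
    return match.group(1).lower() if match else None

def is_allowed(url):
    """Allow only http and https; everything else is refused."""
    return scheme_of(url) in ("http", "https")

tricky = "java\tscript:alert(1)"
print("javascript:" in tricky)  # the naive substring filter does not see it
print(scheme_of(tricky))
print(is_allowed("https://example.com"))
```

The cleaned string is used only to classify the URL, not as the value written into the page. Relative URLs have no scheme, so this sketch refuses them too; whether that is acceptable depends on the application.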
