PHP: is urlencode() a safe way to allow valid UTF-8 strings in the URL?

I have user-submitted tags that can be any type of (valid) UTF-8 string. I want to know if it is safe to include them in the URL merely by running them through urlencode().

In other words, is urlencode() safe to use for valid UTF-8 strings? (By valid I mean I have already force-encoded them to UTF-8.)

urlencode does not depend on a specific character encoding. It just looks at the bytes, interprets them as ASCII characters, and percent-encodes any byte that is either outside the ASCII range (0x80–0xFF) or not allowed unescaped in a URL.
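To illustrate the byte-wise behaviour, here is a minimal sketch. The sample tag is an assumption chosen for illustration; its non-ASCII character is written as explicit byte escapes so the result does not depend on how the source file is saved.

```php
<?php
// Hypothetical tag "café", written with explicit UTF-8 bytes:
// "é" is the two-byte sequence 0xC3 0xA9.
$tag = "caf\xC3\xA9";

// urlencode() works on bytes, so each byte of "é" is escaped separately.
$encoded = urlencode($tag);
echo $encoded, "\n"; // caf%C3%A9

// Decoding restores exactly the same bytes.
$roundTrip = urldecode($encoded);
var_dump($roundTrip === $tag); // bool(true)
```

Because nothing in this process inspects the character encoding, the same call behaves identically for any other byte-oriented encoding.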

Now to your question: yes, urlencode encodes any string in any character encoding so that it can be used safely, but only in the URL query! That is because urlencode formats its input according to application/x-www-form-urlencoded, which differs from “normal” percent-encoding in how the space is encoded: in application/x-www-form-urlencoded, spaces are replaced by +, while “normal” percent-encoding replaces them with %20.

If you want “normal” percent-encoding, use rawurlencode instead.
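A short sketch of the difference, assuming a tag that contains a space. The example.com URLs and their layout are hypothetical and only show where each function would typically be used.

```php
<?php
// Hypothetical tag containing a space.
$tag = "hello world";

$formEncoded = urlencode($tag);    // application/x-www-form-urlencoded style
$rawEncoded  = rawurlencode($tag); // "normal" percent-encoding

echo $formEncoded, "\n"; // hello+world
echo $rawEncoded, "\n";  // hello%20world

// Assumed URL layouts, for illustration only:
$queryUrl = "https://example.com/search?tag=" . $formEncoded; // query string
$pathUrl  = "https://example.com/tags/" . $rawEncoded;        // path segment
```

In a path segment a literal + is not generally decoded as a space, which is why rawurlencode is the safer choice there.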

Yes, urlencode() should make a safe URL string out of any input string, as long as whatever that URL maps to (folder/file/htaccess) doesn't have funky characters in it. Whenever I sanitize input from a user who could be posting something funky, I love this function:

utf8_encode()

Just to be entirely on the safe side, I would remove newlines first. They are not dangerous in themselves, but they can be stepping stones in exploiting other vulnerabilities.
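One way this could look, as a rough sketch: the sample tag is hypothetical, and dropping the newline characters outright (rather than, say, replacing them with a space) is an assumption about what "remove" should mean here.

```php
<?php
// Hypothetical tag with an embedded CR/LF pair.
$tag = "news\r\nupdate";

// Drop carriage returns and line feeds before encoding.
$clean = str_replace(["\r", "\n"], '', $tag);

$encoded = rawurlencode($clean);
echo $encoded, "\n"; // newsupdate
```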
