Download files with non-ASCII characters in the name
My website allows users to upload files with any name. Some names, of course, will have non-ASCII characters. When a user uploads a file, I save it in a folder under its original name. However, when I try to download it by accessing its location (for example, files/Tolstoy - How much land does a man need?.pdf), I get a 404. Is there some way to solve this so that the files keep their original names? Via Apache, maybe?
Just use URL encoding, also known as percent-encoding. It is meant for exactly this: handling URLs on the web. All URLs printed into HTML should be URL-encoded.
For PHP, use rawurlencode(), which follows the URL standard (RFC 3986). urlencode() does not: it applies form encoding, which turns spaces into +.
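To make the difference concrete, here is a minimal sketch using the file name from the question. The variable names and the files/ prefix are only illustrative. In a URL path a + is a literal plus sign, not a space, which is one likely reason a urlencode() link ends in a 404.

```php
<?php
// File name taken from the question; the "files/" prefix is assumed.
$name = 'Tolstoy - How much land does a man need?.pdf';

// rawurlencode(): space becomes %20, "?" becomes %3F.
$raw = rawurlencode($name);

// urlencode(): form encoding, space becomes "+".
$form = urlencode($name);

// Encode only the file name, never the "/" that separates path segments.
$href = 'files/' . $raw;

// Escape separately for the HTML context when printing the link.
echo '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
    . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . "</a>\n";
```

Note that an unencoded ? would start the query string, so the server would look for a file named "Tolstoy - How much land does a man need" and ignore the rest.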
Edit: for this issue, 'PHP encodes "é" as "e%26%23769%3B", instead of "e%CC%81"':
e%CC%81 would be the UTF-8 encoding of é (an e followed by the combining acute accent, U+0301). e%26%23769%3B would be the encoding of e&#769;, which is the HTML entity for the same. This means that either you are making an explicit htmlentities() call there before URL-encoding, or your server setup does that automatically. That call is not strictly needed if proper character sets are in place (only an htmlspecialchars() call is actually needed), but it shouldn't break anything either.
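A small sketch of the two encodings, assuming PHP 7 or later for the "\u{...}" escape. The variable names are made up for illustration, and the entity string is written out literally rather than produced by any particular function.

```php
<?php
// "é" written as e + combining acute accent (U+0301), in UTF-8.
$decomposed = "e\u{0301}";

// The same text after the accent was turned into a numeric HTML entity.
$withEntity = 'e&#769;';

$a = rawurlencode($decomposed);  // e%CC%81
$b = rawurlencode($withEntity);  // e%26%23769%3B ("&", "#" and ";" get escaped)

// Decoding the entity back to UTF-8 first gives the expected result again.
$decoded = html_entity_decode($withEntity, ENT_QUOTES, 'UTF-8');
$c = rawurlencode($decoded);     // e%CC%81
```

So the percent-encoding step itself is behaving correctly; the entity is already in the string by the time it gets encoded.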
Workaround: convert file names to ASCII at upload time. You will be happy with it.
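One way this could look, as a rough sketch only: asciiFileName() is a hypothetical helper, and the set of allowed characters is an assumption you may want to change.

```php
<?php
// Hypothetical helper: reduce an uploaded file name to a conservative ASCII subset.
function asciiFileName(string $name): string
{
    // Replace each run of bytes outside A-Z, a-z, 0-9, dot, underscore and
    // hyphen with a single underscore. Slashes are replaced too, so no
    // directory separators survive.
    $safe = preg_replace('/[^A-Za-z0-9._-]+/', '_', $name);

    // Avoid empty names and names starting with a dot (such as .htaccess).
    if ($safe === '' || $safe[0] === '.') {
        $safe = 'file' . $safe;
    }
    return $safe;
}

echo asciiFileName('Tolstoy - How much land does a man need?.pdf'), "\n";
```

Different original names can collapse to the same ASCII name, so you would still need to handle collisions, and store the original name somewhere if you want to show it to users.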
Well, for some reason that I still don't understand, using rawurlencode() instead of urlencode() made it work.
However, the character é (among others, I'm sure) is still being encoded strangely (e%26%23769%3B instead of simply %C3%A9). Even stranger is that the links containing it work.
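If you want to check what name the server is actually asked for, decoding the link helps. This is just a debugging sketch; describeSegment() is a made-up helper name.

```php
<?php
// Hypothetical helper: show what a percent-encoded path segment decodes to,
// both as text and as hex bytes.
function describeSegment(string $encoded): array
{
    $decoded = rawurldecode($encoded);
    return ['name' => $decoded, 'hex' => bin2hex($decoded)];
}

// Prints the decoded name and its bytes for the three forms seen in this thread.
foreach (['e%26%23769%3B', 'e%CC%81', '%C3%A9'] as $segment) {
    $info = describeSegment($segment);
    echo $segment, ' => ', $info['name'], ' (', $info['hex'], ")\n";
}
```

One possible explanation for the working links, which is only a guess: if the file was saved under a name that literally contains e&#769;, then a link that decodes to exactly those characters matches it. Also note that %C3%A9 (a single precomposed é) and e%CC%81 (e plus a combining accent) are different byte sequences, so they name different files.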