
Encode (and redirect) to a URL that has special characters in Java

I have a URL in a String object like this:

http://bhorowitz.com/2011/03/24/bubble-trouble-i-don’t-think-so/

The URL may or may not contain Unicode characters that need to be encoded. For example, the link above should be transformed to:

http://bhorowitz.com/2011/03/24/bubble-trouble-i-don%e2%80%99t-think-so/

before I redirect to it.

How do I properly escape all special characters (such as Unicode) while keeping the rest of the URL structure intact? Is there something out there already that will do this, or do I need to roll my own?

Edit: the tricky part is that I need to escape only the invalid characters while leaving the rest of the URL untouched (e.g. http:// should remain http:// and should not be escaped). URLEncoder, as far as I can tell, does not allow me to do this.

I think this is what you were actually looking for:

new URL(yourURLString).toURI().toASCIIString();

It percent-encodes only the non-ASCII characters and leaves everything else untouched. Note that toURI() throws a URISyntaxException if the string contains characters that are illegal in a URI, such as spaces.
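
To make the one-liner above concrete, here is a minimal, self-contained sketch. It is an assumption-laden variant rather than the exact call from the answer: it uses URI.create instead of new URL(...).toURI() (the URL constructors are deprecated in recent JDKs), and the class and method names AsciiUrlDemo and toAscii are made up for illustration.

```java
import java.net.URI;

class AsciiUrlDemo {
    // Hypothetical helper: percent-encodes the non-ASCII characters only.
    static String toAscii(String rawUrl) {
        // URI.create throws IllegalArgumentException if rawUrl contains
        // characters that are illegal in a URI, such as a space.
        return URI.create(rawUrl).toASCIIString();
    }

    public static void main(String[] args) {
        // \u2019 is the right single quotation mark from the question's URL.
        String raw = "http://bhorowitz.com/2011/03/24/bubble-trouble-i-don\u2019t-think-so/";
        System.out.println(toAscii(raw));
    }
}
```

The JDK emits uppercase hex digits (%E2%80%99), which is equivalent to the lowercase form shown in the question, since percent-encoded hex digits are case-insensitive.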

The JDK ships with enough tools to handle what you want. Please refer to the documentation: http://download.oracle.com/javase/6/docs/api/java/net/URLEncoder.html and http://download.oracle.com/javase/6/docs/api/java/net/URLDecoder.html

Usage is pretty straightforward.

String decoded = URLDecoder.decode("url%20to%20decode", "UTF-8");
String encoded = URLEncoder.encode("url to decode", "UTF-8");

Please note that a proper character encoding should be provided. Both classes also have single-parameter versions of these methods, but those are deprecated.
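
For reference, a short sketch of what these two classes produce, including what happens when a complete URL is passed in. It assumes Java 10 or later for the overloads that take a Charset (they avoid the checked UnsupportedEncodingException); FormEncodingDemo and its helper methods are illustrative names only.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class FormEncodingDemo {
    // Thin wrappers around the Charset overloads (Java 10+).
    static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(encode("url to decode"));          // url+to+decode
        System.out.println(decode("url%20to%20decode"));      // url to decode
        // A whole URL loses its structure: ':' and '/' are escaped as well.
        System.out.println(encode("http://example.com/a b")); // http%3A%2F%2Fexample.com%2Fa+b
    }
}
```

URLEncoder implements HTML form encoding (application/x-www-form-urlencoded), so it suits individual query parameter values rather than complete URLs, which matches the concern raised in the question's edit.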

I believe this does what you want, although it will encode everything in the path except the / separators. It is perhaps not the most elegant solution, but it should be safe to use.

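    // Needs: java.net.MalformedURLException, java.net.URL,
    //        java.net.URLEncoder, java.util.Scanner
    // URLEncoder.encode(String, String) declares UnsupportedEncodingException.

    // make sure url is valid before parsing it
    try {
        new URL(url);
    } catch (MalformedURLException e) {
        return;
    }

    StringBuilder sb = new StringBuilder();
    Scanner scanner = new Scanner(url).useDelimiter("/");

    // append the scheme token, e.g. "http:", plus the first slash
    sb.append(scanner.next());
    sb.append('/');

    // append the empty token between the two slashes of "//", plus the second slash
    sb.append(scanner.next());
    sb.append('/');

    // encode the remaining tokens; the hostname comes first and passes through
    // unchanged only if it has nothing URLEncoder escapes (so no ":port")
    while (scanner.hasNext()) {
        String part = scanner.next();
        // URLEncoder writes a space as '+', which is form encoding; a path needs %20
        sb.append(URLEncoder.encode(part, "UTF-8").replace("+", "%20"));
        sb.append('/');
    }

    // remove trailing slash if original doesn't have one
    if (!url.endsWith("/")) {
        sb.deleteCharAt(sb.length() - 1);
    }

    String encoded = sb.toString();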
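
As an alternative to splitting the string by hand, the following sketch lets the JDK do the splitting and quoting. This is a different technique from the snippet above, not the answerer's method: it parses the string with java.net.URL and rebuilds it through the multi-argument java.net.URI constructor, which quotes illegal characters in each component. UrlEscaper and escape are hypothetical names. The sketch assumes the input is not already percent-encoded, because this constructor always escapes a literal %. Note also that new URL(String) is deprecated in recent JDKs.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class UrlEscaper {
    // Hypothetical helper: escapes illegal characters per URL component.
    static String escape(String rawUrl) {
        try {
            // URL parses leniently, so spaces and non-ASCII characters are accepted.
            URL url = new URL(rawUrl);
            // The multi-argument constructor quotes characters that are illegal
            // in each component (for example a space becomes %20).
            URI uri = new URI(url.getProtocol(), url.getAuthority(),
                              url.getPath(), url.getQuery(), url.getRef());
            // toASCIIString then percent-encodes the remaining non-ASCII characters.
            return uri.toASCIIString();
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("Cannot escape URL: " + rawUrl, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(escape("http://example.com/a b/caf\u00e9"));
        System.out.println(escape(
            "http://bhorowitz.com/2011/03/24/bubble-trouble-i-don\u2019t-think-so/"));
    }
}
```

Unlike the per-segment approach, this leaves the scheme, host and port alone and keeps the ? and # delimiters, at the cost of double-encoding any input that already contains percent escapes.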
