HttpURLConnection Character Encoding

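I am writing a simple program. There is a URL whose page uses the "utf-8" charset. I want to fetch the entire source of this page, but I am running into a character encoding problem.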

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

class WholeTest {

    HttpURLConnection conn;


    public void openUrl() throws Exception {

        URL pageUrl = new URL("http://naver.com");
        conn = (HttpURLConnection)pageUrl.openConnection();
        conn.setRequestMethod("GET");
        conn.setUseCaches(false);

        conn.setRequestProperty("Host", "naver.com");
        conn.setRequestProperty("Connection", "keep-alive");
        conn.setRequestProperty("Cache-Control", "max-age=0");
        conn.setRequestProperty("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
        conn.setRequestProperty("User-Agent","Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.115 Safari/537.36");
        conn.setRequestProperty("Accept-Encoding", "gzip, deflate, sdch");
        conn.setRequestProperty("Accept-Language", "ko-KR,ko;q=0.8,en-US;q=0.6,en;q=0.4");


        BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream(),"utf-8"));
        String inputLine;
        StringBuffer response = new StringBuffer();

        while ((inputLine = in.readLine()) != null) {
            response.append(inputLine);
        }
        in.close();

        System.out.println("---result---");
        System.out.println(response.toString());
    }
}


public class Whole {

    public static void main(String args[]) throws Exception {
        System.out.print("Test");

        WholeTest w = new WholeTest();
        w.openUrl();
    }

}

The result is: ???????????????????????????????????????????????????? ???? I cannot see the source. I used the "utf-8" charset when reading the InputStream, so what am I doing wrong?

I tried utf-8, UTF-8, euc-kr, and EUC-KR; the result is the same every time.

As I suspected: comment out or remove the following line, and it will work like a charm.

conn.setRequestProperty("Accept-Encoding", "gzip, deflate, sdch");

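That header tells the server it may send a gzip-compressed body, but the code reads the response as if it were plain text/html. HttpURLConnection does not decompress the body for you, so the compressed bytes get decoded as UTF-8 and come out as garbage, whichever charset you pick.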
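If you would rather keep compression than drop the header, one option is to decompress the body yourself before decoding it. The following is only a minimal sketch, not part of the original answer: `GzipBodyDemo`, `readBody`, and `gzip` are hypothetical names, it handles only gzip (so you would request `"gzip"` alone, not `deflate` or `sdch`), and it uses in-memory bytes in place of a real response so it runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipBodyDemo {

    // Hypothetical helper: unwraps gzip when the Content-Encoding value says so,
    // then decodes the (decompressed) bytes with the given charset.
    static String readBody(InputStream raw, String contentEncoding, Charset charset)
            throws IOException {
        InputStream in = "gzip".equalsIgnoreCase(contentEncoding)
                ? new GZIPInputStream(raw)
                : raw;
        try (InputStream body = in) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = body.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), charset);
        }
    }

    // Hypothetical helper used only to fake a compressed response body.
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String page = "<html><body>안녕하세요</body></html>";
        byte[] compressed = gzip(page.getBytes(StandardCharsets.UTF_8));

        // What the question's code effectively does: decode compressed bytes as text.
        String wrong = new String(compressed, StandardCharsets.UTF_8);
        System.out.println("decoded without decompressing matches page: " + wrong.equals(page));

        // Decompress first, then decode.
        String right = readBody(new ByteArrayInputStream(compressed), "gzip", StandardCharsets.UTF_8);
        System.out.println("decompressed then decoded matches page: " + right.equals(page));
    }
}
```

With a real connection, the same helper could be called as `readBody(conn.getInputStream(), conn.getContentEncoding(), StandardCharsets.UTF_8)`, where `getContentEncoding()` returns the response's Content-Encoding header or null when the server sent the body uncompressed.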
