What is the proper method to parse and send HTTP requests using sockets in Java?

I'm creating a basic local proxy server. The goal is to accept HTTP and HTTPS traffic from my web browser, parse it for information, forward each request to the proper host, and then return the response to the web browser.

I currently have an open socket to my web browser. I am receiving both HTTP and HTTPS requests from the browser, like so:

HTTP:

GET http://example.com/ HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1

HTTPS:

CONNECT example.com:443 HTTP/1.1
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0
Proxy-Connection: keep-alive
Connection: keep-alive
Host: example.com:443

I open a socket to the host named in the "Host:" header above with the following code:

public void sendRequest() throws IOException {
    Socket socket = new Socket(host, port);
    //socket.getInputStream.read();
    BufferedWriter out = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream(), "UTF8"));
    BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
    for (int i = 0; i < lines.size(); i++) {
        out.write(lines.get(i) + "\r\n");
    }
    out.flush();
    outputReturn(in);
}

And I receive the reply like so:

public void outputReturn(BufferedReader in) {
    try {
        System.out.println("\n * Response");
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line);
        }
    }
    catch (IOException i) {
        System.out.println(i);
    }
}

The replies come back like so:

HTTP:

* Response
HTTP/1.1 200 OK
Content-Encoding: gzip
Accept-Ranges: bytes
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Wed, 10 Apr 2019 22:53:28 GMT
Etag: "1541025663+gzip"
Expires: Wed, 17 Apr 2019 22:53:28 GMT
Last-Modified: Fri, 09 Aug 2013 23:54:35 GMT
Server: ECS (ord/4C92)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 606

[unreadable, binary-looking response body printed here]

HTTPS:

CONNECT getpocket.cdn.mozilla.net:443 HTTP/1.1
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0
Proxy-Connection: keep-alive
Connection: keep-alive
Host: getpocket.cdn.mozilla.net:443


 * Response
java.net.SocketException: Connection reset

Questions:

Why do I receive what seems like binary from the HTTP request?

Why do I receive nothing from my HTTPS request?

What SHOULD I be doing instead?

Thanks in advance.

For your HTTP request, the Content-Encoding is gzip. The binary is the gzip-compressed response body; printing it through a BufferedReader decodes those bytes as text, which is why it looks like garbage.
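
To make the gzip point concrete, here is a minimal sketch of inflating such a body, assuming the body has been read from the socket as raw bytes (an InputStream, not a Reader). The class name GzipBodyDemo and both method names are hypothetical, and the gzip helper exists only so the example runs without a network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipBodyDemo {

    /** Inflates a gzip-compressed body that was read as raw bytes. */
    public static byte[] gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    /** Demo helper only (assumption): compresses bytes to stand in for a server response body. */
    public static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] body = gzip("<html>hello</html>".getBytes(StandardCharsets.UTF_8));
        // Decode to text only after inflating, never before.
        System.out.println(new String(gunzip(body), StandardCharsets.UTF_8));
    }
}
```

Decoding like this is only needed if the proxy wants to inspect the page. If it simply relays the response, passing the compressed bytes through unchanged is enough.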

For your HTTPS request, you're not making an SSL/TLS handshake, so the server drops the connection. Port 443 expects a TLS handshake, and your code sends it the plaintext CONNECT request instead.
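
A CONNECT request is addressed to the proxy itself, so the proxy has to recognize it rather than forward it. The following is a rough sketch of that check, assuming the request line has already been read as a string. ConnectTarget and its parse method are hypothetical names, and the parsing is deliberately simplistic.

```java
public class ConnectTarget {
    public final String host;
    public final int port;

    private ConnectTarget(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /**
     * Returns the host and port named by a CONNECT request line,
     * or null if the line is not a well-formed CONNECT request.
     */
    public static ConnectTarget parse(String requestLine) {
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length != 3 || !parts[0].equals("CONNECT")) {
            return null; // some other method, e.g. GET
        }
        String authority = parts[1];
        int colon = authority.lastIndexOf(':');
        if (colon <= 0 || colon == authority.length() - 1) {
            return null; // CONNECT targets are expected in host:port form
        }
        try {
            int port = Integer.parseInt(authority.substring(colon + 1));
            return new ConnectTarget(authority.substring(0, colon), port);
        } catch (NumberFormatException e) {
            return null; // port was not a number
        }
    }

    public static void main(String[] args) {
        ConnectTarget t = parse("CONNECT example.com:443 HTTP/1.1");
        System.out.println(t.host + " " + t.port);
    }
}
```

With the request lines from the question, the CONNECT line yields a target and the GET line yields null, so the two kinds of traffic can be routed to different handlers.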

For HTTP, I don't think you need to do anything; the browser should handle the decompression for you, as long as the response bytes reach it unchanged. There's no feasible way to proxy HTTPS/SSL/TLS traffic using the method you described.
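
The limitation above concerns forwarding parsed text lines. A different approach that proxies commonly take for CONNECT is a blind tunnel: confirm the tunnel to the browser, then copy raw bytes in both directions and let the browser perform the TLS handshake with the server itself. The sketch below illustrates that idea and is not a complete proxy. ConnectTunnel, pump, and tunnel are hypothetical names, and it assumes the CONNECT request headers were already consumed from the browser socket.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class ConnectTunnel {

    /** Copies raw bytes until end of stream; nothing is decoded as text. */
    public static long pump(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            out.flush(); // forward promptly so the handshake is not stalled
            total += n;
        }
        return total;
    }

    /** Hypothetical handler: call once the CONNECT headers have been read. */
    public static void tunnel(Socket browser, String host, int port) throws IOException {
        try (Socket server = new Socket(host, port)) {
            // Tell the browser the tunnel is open; it starts TLS after this.
            OutputStream toBrowser = browser.getOutputStream();
            toBrowser.write("HTTP/1.1 200 Connection Established\r\n\r\n"
                    .getBytes(StandardCharsets.US_ASCII));
            toBrowser.flush();

            // Browser -> server on a helper thread.
            Thread upstream = new Thread(() -> {
                try {
                    pump(browser.getInputStream(), server.getOutputStream());
                } catch (IOException ignored) {
                    // one side closed the connection; the tunnel is finished
                }
            });
            upstream.setDaemon(true);
            upstream.start();

            // Server -> browser on the calling thread.
            pump(server.getInputStream(), toBrowser);
        }
    }
}
```

A tunnel like this cannot parse the HTTPS traffic for information, because everything passing through it is encrypted. The caller stays responsible for closing the browser socket.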
