Proper HTTP GET Request

I am having some difficulty understanding the concept of an HTTP GET request, beyond knowing that it asks a server for a web page. Today I wrote a class that tries to use an HTTP GET request to retrieve the HTML of a web page. Let me include the class and explain my confusion:

import java.io.*;
import java.net.*;

public class HTMLFetcher 
{
    private static final int PORT = 80;
    private URL url;


    public HTMLFetcher(String url) throws Exception // url = http://www.-----.com/birds.html
    {
        this.url = new URL(url);
        fetch(this.url.getHost());
    }

    private  String createRequest(URL url) { // Is there a problem with this request? 
        String request = "GET" + "/index.html" + "HTTP/1.1\n";
        request += "Host: www.cs.usfca.edu\n";
        request += "Connection: close";
        request += "\r\n";
        return request;
    }

    public void fetch(String urlDomain) throws Exception {

        System.out.println(urlDomain + ":" + PORT);

        // TODO: create a new socket here for a given urlDomain and a given PORT
        Socket socket = new Socket(urlDomain, PORT);

        // TODO: create PrintWriter for the socket's output stream
        PrintWriter writer = 
                new PrintWriter(new OutputStreamWriter(socket.getOutputStream()));

        BufferedReader reader = 
                new BufferedReader(new InputStreamReader(socket.getInputStream()));

        String request = createRequest(urlDomain); // createRequest complains that this is a String and not a URL
        System.out.println(request);
        writer.write(request);
        writer.flush();

        StringBuilder string = new StringBuilder();
        boolean htmlFound = false;
        String line;
        while ((line = reader.readLine()) != null) {
            if (!htmlFound) {
                if (line.toLowerCase().startsWith("<html>")) {
                    htmlFound = true;
                } else {
                    continue;
                }
            }
            System.out.println("This is each line: " + line);
            string.append(line + "\n");
        }

        reader.close();
        writer.close();
        socket.close();

        //System.out.println(string.toString());
        System.out.println("[done]");
    }
}

So basically I am confused about how I can pass the String urlDomain to the createRequest method when it expects a URL. Is the createRequest parameter necessary for the HTTP request? Am I setting up the request properly?

Right now it outputs the following:

www.cs.usfca.edu:80
GET/index.htmlHTTP/1.1
Host: www.cs.usfca.edu
Connection: close

This is each line: <html><head>
This is each line: <title>501 Method Not Implemented</title>
This is each line: </head><body>
This is each line: <h1>Method Not Implemented</h1>
This is each line: <p>GET/index.htmlHTTP/1.1 to /index.html not supported.<br />
This is each line: </p>
This is each line: <hr>
This is each line: <address>Apache/2.2.15 (CentOS) Server at www.cs.usfca.edu Port 80</address>
This is each line: </body></html>
[done]

Thank you for your help. Please let me know if I can be more specific.

As I understand it, the Host header matters when the website is on a shared hosting server, where multiple domains map to the same IP address and the server needs the Host header to identify the virtual host the request should be routed to. HTTP/1.1 also requires a Host header in every request, so always include it.

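BTW, in the current code there are no spaces in the request line, so the server reads GET/index.htmlHTTP/1.1 as one unknown method name. That's why you are getting the 501 error page as the response. The request line needs a space between the method, the path, and the version: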

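private String createRequest(String host) {
    // The request line needs spaces between method, path, and version.
    String request = "GET " + "/ " + "HTTP/1.1\r\n";
    // Header lines end with CRLF, not a bare LF.
    request += "Host: " + host + "\r\n";
    // Ask the server to close the connection so readLine() eventually returns null.
    request += "Connection: close\r\n";
    // A blank line ends the header section.
    request += "\r\n";
    return request;
}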
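
On the parameter question: the request does not have to be hard-coded, since both the host and the path can be taken from the URL the class was given. A minimal sketch of that idea, assuming a hypothetical helper class named RequestBuilder (not part of the original code), java.net.URI for parsing, and www.example.com as a placeholder host:

```java
import java.net.URI;

public class RequestBuilder {
    // Hypothetical helper: builds a minimal HTTP/1.1 GET request for the given URI.
    public static String createRequest(URI uri) {
        String target = uri.getRawPath();
        if (target == null || target.isEmpty()) {
            target = "/"; // the request target must not be empty
        }
        if (uri.getRawQuery() != null) {
            target += "?" + uri.getRawQuery();
        }
        return "GET " + target + " HTTP/1.1\r\n"   // method, target, version separated by spaces
                + "Host: " + uri.getHost() + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";                          // blank line ends the headers
    }

    public static void main(String[] args) {
        // Placeholder URL, used only to show the generated request text.
        System.out.print(createRequest(URI.create("http://www.example.com/birds.html")));
    }
}
```

With something like this, the constructor could keep the parsed URL and pass it along, instead of passing only the host name to fetch.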

Also, don't check for the start of the HTML like this, because the opening tag may carry attributes (for example <html lang="en">):

if (line.toLowerCase().startsWith("<html>")) 

Instead use:

if (line.toLowerCase().startsWith("<html")) 
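
An alternative to searching for a tag at all is to rely on the shape of an HTTP response: the status line and headers come first, then one blank line, then the body. A rough sketch of that approach, assuming a hypothetical helper class named ResponseSplitter and a canned response string in place of a real socket; it does not decode chunked transfer encoding, which a real server may use:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ResponseSplitter {
    // Hypothetical helper: returns everything after the first blank line,
    // which is where the header section of an HTTP response ends.
    public static String readBody(BufferedReader reader) {
        StringBuilder body = new StringBuilder();
        boolean inBody = false;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (inBody) {
                    body.append(line).append('\n');
                } else if (line.isEmpty()) {
                    inBody = true; // the blank line terminates the headers
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Canned response used for illustration only.
        String response = "HTTP/1.1 200 OK\r\n"
                + "Content-Type: text/html\r\n"
                + "\r\n"
                + "<!DOCTYPE html>\n<html lang=\"en\"><body>hi</body></html>\n";
        System.out.print(readBody(new BufferedReader(new StringReader(response))));
    }
}
```

In the fetch method, the same loop could be run over the reader that wraps the socket's input stream.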

BTW, why do it the hard way? Use HttpURLConnection instead.
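
For comparison, here is roughly what that could look like. This is only a sketch: the class name UrlConnectionFetcher and the readAll helper are made up for illustration, the page is assumed to be UTF-8, and the URL is taken from the command line rather than hard-coded.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class UrlConnectionFetcher {
    // Hypothetical helper: reads a stream fully as text (assumes UTF-8).
    public static String readAll(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Sends a GET request and returns the response body; the request line,
    // Host header, and header parsing are handled by HttpURLConnection.
    public static String fetch(String address) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setRequestMethod("GET");
        try {
            System.out.println("Status: " + conn.getResponseCode());
            // getInputStream() throws an IOException for error statuses such as 404.
            return readAll(conn.getInputStream());
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java UrlConnectionFetcher <url>");
            return;
        }
        System.out.print(fetch(args[0]));
    }
}
```

The trade-off is that you no longer see the raw request and response, which is fine unless the point of the exercise is to learn the protocol itself.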
