Getting an HTML page using an HTTP GET request and sockets in C

I am trying to print the HTML page at https://pastebin.com/raw/7y7MWssc

Here is my code:

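#include <stdio.h>
#include <string.h>
#include <winsock2.h>

int main(void){
    WSADATA WSA;
    if (WSAStartup(MAKEWORD(2,2), &WSA) != 0) return 1;

    /* No Host header: this is the request that triggers the 400 response. */
    char  Request[] = "GET /raw/7y7MWssc HTTP/1.1\r\n\r\n" ;
    char  Response[2000] ;

    SOCKET  Socket = socket(AF_INET , SOCK_STREAM , 0 );
    struct sockaddr_in Server;
    struct hostent *H = gethostbyname("pastebin.com") ;
    if (Socket == INVALID_SOCKET || H == NULL) { WSACleanup(); return 1; }

    memset(&Server, 0, sizeof(Server));
    memcpy(&Server.sin_addr, H->h_addr, H->h_length);  /* copy the resolved IPv4 address */
    Server.sin_family       = AF_INET;
    Server.sin_port         = htons( 80 );

    if (connect(Socket , (struct sockaddr *)&Server , sizeof(Server)) != 0) {
        closesocket(Socket); WSACleanup(); return 1;
    }
    send(Socket , Request , (int)strlen(Request) , 0)  ;
    /* Leave room for the terminator; recv returns SOCKET_ERROR (-1) on failure. */
    int Rs = recv(Socket , Response , (int)sizeof(Response) - 1 , 0) ;
    if (Rs < 0) Rs = 0 ;
    Response[Rs] = 0 ;
    printf("%s\n", Response );

    closesocket(Socket);
    WSACleanup();
    return 0;
}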

But I keep getting 400 Bad Request as the response. When the request is "GET /raw/7y7MWssc HTTP/1.1\r\nHost: pastebin.com\r\n\r\n" instead, I get 301 Moved Permanently with Location: https://pastebin.com/raw/7y7MWssc. Thanks for your help.

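You are getting the 400 Bad Request error because the request you are sending is invalid. You have declared that you are using HTTP/1.1, but an HTTP/1.1 request must include a Host: header. Compare this: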

$ telnet pastebin.com 80
Trying 104.23.98.190...
Connected to pastebin.com.
Escape character is '^]'.
GET /raw/7y7MWssc HTTP/1.1

HTTP/1.1 400 Bad Request
Server: cloudflare
Date: Wed, 14 Apr 2021 17:48:12 GMT
Content-Type: text/html
Content-Length: 155
Connection: close
CF-RAY: -

to this:

$ telnet pastebin.com 80
Trying 104.23.98.190...
Connected to pastebin.com.
Escape character is '^]'.
GET /raw/7y7MWssc HTTP/1.1
Host: pastebin.com

HTTP/1.1 301 Moved Permanently
Date: Wed, 14 Apr 2021 17:49:32 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Cache-Control: max-age=3600
Expires: Wed, 14 Apr 2021 18:49:32 GMT
Location: https://pastebin.com/raw/7y7MWssc
cf-request-id: 097319bbec00005b119e0df000000001
Server: cloudflare
CF-RAY: 63fec562aa1c5b11-IAD

Note how the second request (which includes the Host: header) returns the expected response, a 301 redirect, rather than the 400 Bad Request error.
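As a minimal sketch of the fix on the C side, the request string can be assembled with the Host: header included. The helper name `build_get_request` is hypothetical (it is not part of the original code), and the `Connection: close` header is an added assumption so that the server closes the connection once the response has been sent.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post): writes a minimal
 * HTTP/1.1 GET request for `path` on `host` into `buf`.
 * Returns the request length, or -1 if `buf` is too small. */
int build_get_request(char *buf, size_t size, const char *host, const char *path)
{
    int n = snprintf(buf, size,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"          /* mandatory in HTTP/1.1 */
                     "Connection: close\r\n" /* assumption: one request per connection */
                     "\r\n",
                     path, host);
    if (n < 0 || (size_t)n >= size)
        return -1;  /* encoding error or output truncated */
    return n;
}
```

Calling `build_get_request(Request, sizeof Request, "pastebin.com", "/raw/7y7MWssc")` and passing the returned length to `send` would replace the hard-coded request string, provided `Request` is declared large enough to hold the result.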


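To actually fetch the content from the final location, you will need to update your code to make an https request, which means connecting to port 443 and speaking TLS rather than sending plain text over port 80.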
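Before adding TLS, the program can at least detect the redirect and see where it points. The sketch below uses hypothetical helpers (`parse_status`, `find_location`, `needs_tls`) that are not from the original post. It assumes the whole header block is already in one NUL-terminated buffer and that the header is spelled exactly "Location: "; a real parser would compare header names case-insensitively.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extracts the status code from the status line,
 * e.g. "HTTP/1.1 301 Moved Permanently" gives 301. Returns -1 on failure. */
int parse_status(const char *response)
{
    int major, minor, code;
    if (sscanf(response, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return -1;
    return code;
}

/* Hypothetical helper: copies the Location header value into `out`.
 * Returns 1 if found in the header block and it fits, 0 otherwise. */
int find_location(const char *response, char *out, size_t size)
{
    const char *end = strstr(response, "\r\n\r\n");      /* end of headers */
    const char *p = strstr(response, "\r\nLocation: ");
    size_t len;

    if (p == NULL || (end != NULL && p > end))
        return 0;               /* absent, or only present in the body */
    p += strlen("\r\nLocation: ");
    len = strcspn(p, "\r\n");   /* value runs to the end of the line */
    if (len >= size)
        return 0;               /* value does not fit in `out` */
    memcpy(out, p, len);
    out[len] = '\0';
    return 1;
}

/* Returns 1 if the URL uses the https scheme, i.e. TLS is required. */
int needs_tls(const char *url)
{
    return strncmp(url, "https://", 8) == 0;
}
```

For the 301 response shown above, `parse_status` would return 301 and `find_location` would yield https://pastebin.com/raw/7y7MWssc, for which `needs_tls` returns 1. At that point a plain Winsock socket is not enough: the connection has to go through a TLS library such as OpenSSL or the Windows Schannel API.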

