C language, get HTML source
I'm trying to get the HTML of this page, http://pastebin.com/raw/7y7MWssc, using C. So far I'm trying to connect to pastebin with a socket on port 80 and then send an HTTP request to get the HTML of that page.
I know what I have so far is probably way off, but here it is:
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

int main()
{
    /*Define socket variables */
    char host[1024] = "pastebin.com";
    char url[1024] = "/raw/7y7MWssc";
    char request[2000];
    struct hostent *server;
    struct sockaddr_in serverAddr;
    int portno = 80;

    printf("Trying to get source of pastebin.com/raw/7y7MWssc ...\n");

    /* Create socket */
    int tcpSocket = socket(AF_INET, SOCK_STREAM, 0);
    if(tcpSocket < 0) {
        printf("ERROR opening socket\n");
    } else {
        printf("Socket opened successfully.\n");
    }

    server = gethostbyname(host);
    serverAddr.sin_port = htons(portno);
    if(connect(tcpSocket, (struct sockaddr *) &serverAddr, sizeof(serverAddr)) < 0) {
        printf("Can't connect\n");
    } else {
        printf("Connected successfully\n");
    }

    bzero(request, 2000);
    sprintf(request, "Get %s HTTP/1.1\r\n Host: %s\r\n \r\n \r\n", url, host);
    printf("\n%s", request);
    if(send(tcpSocket, request, strlen(request), 0) < 0) {
        printf("Error with send()");
    } else {
        printf("Successfully sent html fetch request");
    }
    printf("test\n");
}
The code above made sense up to a certain point, and now I'm confused. How would I make this get the web source from http://pastebin.com/raw/7y7MWssc ?
Fixed. I needed to add
serverAddr.sin_family = AF_INET;
and bzero serverAddr. My HTTP request was also wrong: it had an extra \r\n and stray spaces, as @immibis said.
Corrected:
sprintf(request, "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n", url, host);
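As a rough sketch, the corrected request can be built with a bounds-checked helper instead of a bare sprintf. The name build_get_request is hypothetical, and the Connection: close header is an addition that is not in the original post; it is there on the assumption that the client wants the server to close the socket once the response has been sent.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original post): writes a minimal
 * HTTP/1.1 GET request into buf.
 * Returns the request length, or -1 if buf is too small. */
int build_get_request(char *buf, size_t cap, const char *host, const char *path)
{
    int n = snprintf(buf, cap,
                     "GET %s HTTP/1.1\r\n"    /* upper-case method, no leading spaces */
                     "Host: %s\r\n"
                     "Connection: close\r\n"  /* assumption: ask the server to close when done */
                     "\r\n",                  /* a single blank line ends the headers */
                     path, host);
    if (n < 0 || (size_t)n >= cap)
        return -1;  /* encoding error or truncated output */
    return n;
}
```

A call such as build_get_request(request, sizeof(request), host, url) would replace the sprintf line, and the returned length can be passed to send() instead of calling strlen().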
You are getting the pointer returned by gethostbyname(), but you weren't doing anything with it.
You need to populate the sockaddr_in with the address, the domain (address family) and the port.
This works, but now you need to worry about obtaining the response:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

int main(void)
{
    /* Define socket variables */
    char host[1024] = "pastebin.com";
    char url[1024] = "/raw/7y7MWssc";
    char request[2000];
    struct hostent *server;
    struct sockaddr_in serverAddr;
    short portno = 80;

    printf("Trying to get source of pastebin.com/raw/7y7MWssc ...\n");

    /* Create socket */
    int tcpSocket = socket(AF_INET, SOCK_STREAM, 0);
    if (tcpSocket < 0) {
        printf("ERROR opening socket\n");
        exit(-1);
    } else {
        printf("Socket opened successfully.\n");
    }

    /* Resolve the host name */
    if ((server = gethostbyname(host)) == NULL) {
        fprintf(stderr, "gethostbyname(): error\n");
        exit(-1);
    }

    /* Fill in the server address: zero it first, then set address, family and port */
    memset(&serverAddr, 0, sizeof(serverAddr));
    memcpy(&serverAddr.sin_addr, server->h_addr_list[0], server->h_length);
    serverAddr.sin_family = AF_INET;
    serverAddr.sin_port = htons(portno);

    if (connect(tcpSocket, (struct sockaddr *) &serverAddr, sizeof(serverAddr)) < 0) {
        printf("Can't connect\n");
        exit(-1);
    } else {
        printf("Connected successfully\n");
    }

    /* Build the request: upper-case GET, no stray spaces, one blank line at the end */
    memset(request, 0, sizeof(request));
    snprintf(request, sizeof(request), "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n", url, host);
    printf("\n%s", request);

    if (send(tcpSocket, request, strlen(request), 0) < 0) {
        printf("Error with send()\n");
        exit(-1);
    } else {
        printf("Successfully sent html fetch request\n");
    }
    return 0;
}
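To obtain the response, one option is to call recv() in a loop until it returns 0, which means the server has closed the connection. The sketch below assumes exactly that; with HTTP/1.1 the server usually keeps the connection open unless the request carries a Connection: close header, so without that header the loop may block until the server times out. The helper names read_response and find_body are hypothetical and not part of the original answer.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: reads from fd until the peer closes the connection
 * or buf is full. The result is always NUL-terminated.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_response(int fd, char *buf, size_t cap)
{
    size_t total = 0;

    if (cap == 0)
        return -1;
    while (total < cap - 1) {
        ssize_t n = recv(fd, buf + total, cap - 1 - total, 0);
        if (n < 0)
            return -1;      /* recv() failed */
        if (n == 0)
            break;          /* peer closed the connection: end of response */
        total += (size_t)n;
    }
    buf[total] = '\0';
    return (ssize_t)total;
}

/* Hypothetical helper: returns a pointer to the first byte after the blank
 * line that separates headers from body, or NULL if there is none. */
const char *find_body(const char *response)
{
    const char *sep = strstr(response, "\r\n\r\n");
    return sep ? sep + 4 : NULL;
}
```

After a successful send(), calling read_response(tcpSocket, buf, sizeof(buf)) and then find_body(buf) would give the page content, provided the response fits in the buffer. This does not decode chunked transfer encoding or follow redirects, and many sites now answer plain HTTP on port 80 with a redirect to HTTPS, so the status line is worth checking first.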