I'm sending raw HTTP headers to a website, and I want to detect errors such as 400 Bad Request or 404 Not Found manually, without using the urllib or Requests packages. I'm sending a HEAD request like this:
head_request = "HEAD %s HTTP/1.1\r\nHost: %s\r\n\r\n" % (url_path, host)
socket_id = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
socket_id.connect((host, 80))
socket_id.send(head_request)
recv_head = socket_id.recv(1024)
How should I manually detect these errors?
One way is to search the response for the HTTP status line using a regular expression.
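For example, a minimal sketch of that approach: the first line of the raw response is the status line (e.g. "HTTP/1.1 404 Not Found"), so a regular expression can pull out the numeric status code and reason phrase. The sample response string below is an assumption standing in for the data returned by socket.recv():

```python
import re

# Assumed sample of what recv_head might contain after socket.recv().
recv_head = "HTTP/1.1 404 Not Found\r\nContent-Type: text/html\r\n\r\n"

# Match the status line: HTTP version, 3-digit status code, reason phrase.
match = re.match(r"HTTP/\d\.\d (\d{3}) (.*)\r\n", recv_head)
if match:
    status_code = int(match.group(1))
    reason = match.group(2)
    if status_code >= 400:
        # 4xx and 5xx codes indicate client and server errors.
        print("Error %d: %s" % (status_code, reason))
```

This only inspects the status line; it does not validate the rest of the response, which is where a real parser helps.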
Another way is to use the http-parser package, which wraps the http_parser.c module from the http-parser project. It can be downloaded from PyPI: https://pypi.python.org/pypi/http-parser/
It parses the HTTP response and works at the socket level.
Here is the description from its documentation:
http-parser provides you with parser.HttpParser, a low-level parser in C that you can access from your Python program, and http.HttpStream, which gives higher-level access to a readable, sequential io.RawIOBase object.
Here is how you can parse the HTTP response using sockets in Python, following the example you gave (adapted from the project's own example):
https://github.com/benoitc/http-parser/tree/master/http_parser
import socket
from http_parser.parser import HttpParser

def main():
    p = HttpParser()
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    body = []
    try:
        s.connect(('gunicorn.org', 80))
        s.send("GET / HTTP/1.1\r\nHost: gunicorn.org\r\n\r\n")
        while True:
            data = s.recv(1024)
            if not data:
                break
            recved = len(data)
            nparsed = p.execute(data, recved)
            assert nparsed == recved
            if p.is_headers_complete():
                print p.get_headers()
            if p.is_partial_body():
                body.append(p.recv_body())
            if p.is_message_complete():
                break
        print "".join(body)
    finally:
        s.close()

if __name__ == "__main__":
    main()