Stripping headers from a response - Python

Question

A typical HTTP 1.0 header looks like this:

Server: nginx/1.6.2 (Ubuntu)
Date: Thu, 03 Mar 2016 07:00:00 GMT
Content-Type: text/html
Content-Length: 13471
Last-Modified: Sat, 19 Dec 2015 02:42:32 GMT
Connection: close
ETag: "5674c418-349f"
Cache-Control: no-store
Accept-Ranges: bytes

<!doctype html> // or <!DOCTYPE html>
# remaining of the page content here.

What's the easiest way to separate the beginning of the page (marked by <!doctype html> or <!DOCTYPE html>) from the headers of the HTTP response? For example

response = get_response()  # get_response() returns a string containing the headers and the page
tokens = response.split("<!doctype html>")  # case-sensitive, and split() also drops the marker itself
return ''.join(tokens)

won't work well. I am looking for a way to split the response into its first half (the headers) and its second half (the body).

Answer 1

You could just use find() with a lowercase version of the response as follows:

response = """
Server: nginx/1.6.2 (Ubuntu)
Date: Thu, 03 Mar 2016 07:00:00 GMT
Content-Type: text/html
Content-Length: 13471
Last-Modified: Sat, 19 Dec 2015 02:42:32 GMT
Connection: close
ETag: "5674c418-349f"
Cache-Control: no-store
Accept-Ranges: bytes

<!doctype html> // or <!DOCTYPE html>
# remaining of the page content here.
"""

# Lowercase a copy for the search only; slice the original so its case is preserved.
print(response[response.lower().find('<!doctype html>'):])

This would print:

<!doctype html> // or <!DOCTYPE html>
# remaining of the page content here.
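One thing to watch for: find() returns -1 when the marker is missing, and slicing from -1 quietly returns only the last character of the string. A minimal sketch of a guarded version follows; the function name strip_headers and the choice to return None are my own assumptions, not part of the code above.

```python
def strip_headers(response, marker='<!doctype html>'):
    """Return the response from the marker onward, or None if it is absent.

    strip_headers is a hypothetical helper name used for illustration.
    """
    # Search a lowercased copy so <!DOCTYPE html> matches as well.
    start = response.lower().find(marker)
    if start == -1:
        # No doctype found; let the caller decide what to do.
        return None
    # Slice the original string so the page keeps its original case.
    return response[start:]


sample = "Content-Type: text/html\n\n<!DOCTYPE html>\n<p>hi</p>\n"
print(strip_headers(sample))
```

Returning None keeps the "not found" case explicit instead of handing back a one-character string.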

Or perhaps just search for <!doctype instead.
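If the page might not contain a doctype at all, a different approach (not the one shown above) is to rely on the blank line that HTTP places between the headers and the body. The following is only a sketch: split_response is a hypothetical name, and it assumes the whole response is already decoded into a single string, with lines ending in either CRLF or a bare LF.

```python
import re

# The first empty line ends the header block; accept CRLF or bare LF.
_BLANK_LINE = re.compile(r"\r?\n\r?\n")


def split_response(response):
    """Split a raw response string into (headers, body).

    split_response is a hypothetical helper name used for illustration.
    """
    match = _BLANK_LINE.search(response)
    if match is None:
        # No blank line found: treat everything as headers.
        return response, ""
    return response[:match.start()], response[match.end():]


raw = "Server: nginx\r\nContent-Type: text/html\r\n\r\n<!doctype html>\r\n<p>hi</p>"
headers, body = split_response(raw)
print(body)
```

This does not depend on what the body starts with, so it should also work for responses that are not HTML.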
