
Python: retrieving web data

I am new to Python and have been trying to figure out the following exercise.

Exercise 5: (Advanced) Change the socket program so that it only shows data after the headers and a blank line have been received. Remember that recv is receiving characters (newlines and all), not lines.

Below is the code I came up with; unfortunately, I don't think it works:

import socket

mysocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysocket.connect(('data.pr4e.org', 80))
mysocket.send('GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode())

count = 0
while True:
    data = mysocket.recv(200)

    if len(data) < 1:
        break

    count = count + len(data.decode().strip())
    print(len(data), count)
    if count >= 399:
        print(data.decode(), end="")
mysocket.close()

Instead of counting how many characters you have received, grab all the data you get and then split on the first double CRLF ('\r\n\r\n') you find. That blank line marks the end of the headers, so everything after it is the body.

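import socket

mysocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysocket.connect(('data.pr4e.org', 80))
mysocket.send('GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode())

# Collect raw bytes; decoding each chunk separately could split a multi-byte character.
resp = []
while True:
    data = mysocket.recv(200)
    if not data:  # empty bytes: the server closed the connection
        break
    resp.append(data)
mysocket.close()

# Everything after the first blank line (double CRLF) is the body.
resp = b"".join(resp)
body = resp.partition(b'\r\n\r\n')[2]
print(body.decode())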
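
If you would rather show the body as it arrives instead of buffering the whole response, one possible variation is sketched below. It is only an illustration, not part of the original answer: `body_chunks` and `recv_chunks` are hypothetical helper names, and the `fake` list stands in for what successive `recv` calls might return, with the blank line deliberately split across chunks to show why you cannot search each chunk on its own.

```python
def body_chunks(chunks):
    """Yield only the bytes that come after the first blank line (double CRLF)."""
    separator = b"\r\n\r\n"
    pending = b""      # header bytes accumulated so far
    in_body = False
    for chunk in chunks:
        if in_body:
            yield chunk
            continue
        pending += chunk
        head, sep, rest = pending.partition(separator)
        if sep:        # separator found: the headers are complete
            in_body = True
            if rest:
                yield rest


def recv_chunks(sock, size=200):
    """Yield chunks from a connected socket until the peer closes it."""
    while True:
        data = sock.recv(size)
        if not data:
            return
        yield data


# Simulated recv() results (an assumption for the demo, not real server output).
fake = [
    b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r",
    b"\n\r",
    b"\nBut soft ",
    b"what light",
]
demo_body = b"".join(body_chunks(fake))
print(demo_body.decode())
```

With a real connection you would loop over `body_chunks(recv_chunks(mysocket))` and print each part. Decoding each part separately is safe for plain ASCII text such as romeo.txt, but for other encodings it is safer to join the bytes first and decode once.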
