When I run this code, it works perfectly:
import requests
import re

def clean(toclean):
    m = re.findall("'(.*?)\'", str(toclean))
    rdy = ''.join([item.rstrip('\n') for item in m])
    return pretty(rdy)

def pretty(pret):
    m = re.findall('UA-[0-9]+-[0-9]+', str(pret))
    rdy = ''.join([item.rstrip('\n') for item in m])
    return rdy

r = requests.get('http://editinginsider.com')
m = re.findall('UA-[0-9]+-[0-9]+', r.text)
print clean(m)
But when I try to iterate over a list of domains in a text file, line by line, I get a "Name or service not known" error:
import requests
import re

def clean(toclean):
    m = re.findall("'(.*?)\'", str(toclean))
    rdy = ''.join([item.rstrip('\n') for item in m])
    return pretty(rdy)

def pretty(pret):
    m = re.findall('UA-[0-9]+-[0-9]+', str(pret))
    rdy = ''.join([item.rstrip('\n') for item in m])
    return rdy

f = open( "domains.txt", "r" )
for line in f:
    r = requests.get(line, timeout=7)
    m = re.findall('UA-[0-9]+-[0-9]+', r.text)
    print clean(m)
f.close()
So what is the deal? I have tried sleeping, timeouts, and upping the max connection attempts, and it still fails.
My bet is that it is something dumb.
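This is most likely caused by the '\n' character at the end of each line read from the domains.txt file. Iterating over a file yields every line with its trailing newline, so that character becomes part of the URL passed to requests.get. Stripping it might work: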
r = requests.get(line.strip(), timeout=7)
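To see the stray newline without touching the network, here is a small sketch. It uses io.StringIO as a stand-in for domains.txt, and the helper name iter_urls and the second URL (http://example.com) are made up for illustration; they are not from the question. It also skips blank lines, which is an extra assumption about what the file may contain.

```python
import io

def iter_urls(lines):
    """Yield each non-empty line with surrounding whitespace removed."""
    for line in lines:
        url = line.strip()  # drops the trailing '\n' (and any stray spaces)
        if url:             # skip blank lines, assumed to be possible in the file
            yield url

# Stand-in for open("domains.txt"); a real file yields lines the same way.
sample = io.StringIO(u"http://editinginsider.com\n\nhttp://example.com\n")

raw_lines = list(sample)
print(repr(raw_lines[0]))  # repr makes the trailing newline visible

sample.seek(0)
urls = list(iter_urls(sample))
print(urls)
```

In the original loop, the same idea would be to iterate over iter_urls(f) instead of f, so every URL handed to requests.get is already stripped.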