
InvalidSchema("No connection adapters were found for '%s'" % url)

I was able to gather data from a web page using this:

import requests
import lxml.html
import re
url = "http://animesora.com/flying-witch-episode-7-english-subtitle/"
r = requests.get(url)
page = r.content
dom = lxml.html.fromstring(page)

for link in dom.xpath('//div[@class="downloadarea"]//a/@href'):
    down = re.findall('https://.*', link)
    print(down)

When I tried to gather more data from the results of the above code, I was presented with this error:

Traceback (most recent call last):
  File "/home/sven/PycharmProjects/untitled1/.idea/test4.py", line 21, in <module>
    r2 = requests.get(down)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 70, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 56, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 590, in send
    adapter = self.get_adapter(url=request.url)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 672, in get_adapter
    raise InvalidSchema("No connection adapters were found for '%s'" % url)
requests.exceptions.InvalidSchema: No connection adapters were found for '['https://link.safelinkconverter.com/review.php?id=aHR0cDovLygqKC5fKC9zTGZYZ0s=&c=1&user=51757']'

This is the code I was using:

for link2 in down:
    r2 = requests.get(down)
    page2 = r.url
    dom2 = lxml.html.fromstring(page2)

for link2 in dom2('//div[@class="button green"]//onclick'):

    down2 = re.findall('.*',down2)
    print (down2)

You are passing in the whole list:

for link2 in down:
    r2 = requests.get(down)

Note how you passed in down, not link2. down is a list, not a single URL string.

Pass in link2 instead:

for link2 in down:
    r2 = requests.get(link2)
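The brackets and quotes in the error message are the giveaway: requests stringifies whatever it is given as a URL, and str() of a one-element list is the list's repr, not the URL inside it. A quick offline check (the URL here is made up for illustration):

```python
down = ["https://example.com/page"]

# str() of a list keeps the brackets and quotes -- exactly the mangled
# "URL" shown inside the InvalidSchema error message
print(str(down))  # ['https://example.com/page']
print(down[0])    # https://example.com/page
```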

I'm not sure why you are using regular expressions. In the loop

for link in dom.xpath('//div[@class="downloadarea"]//a/@href'):

each link is already a fully qualified URL:

>>> for link in dom.xpath('//div[@class="downloadarea"]//a/@href'):
...     print link
...
https://link.safelinkconverter.com/review.php?id=aHR0cDovLygqKC5fKC9FZEk2Qg==&c=1&user=51757
https://link.safelinkconverter.com/review.php?id=aHR0cDovLygqKC5fKC95Tmg2Qg==&c=1&user=51757
https://link.safelinkconverter.com/review.php?id=aHR0cDovLygqKC5fKC93dFBmVFg=&c=1&user=51757
https://link.safelinkconverter.com/review.php?id=aHR0cDovLygqKC5fKC9zTGZYZ0s=&c=1&user=51757

You don't need to do any further processing on that.
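Dropping the regex, the first loop can collect usable URLs directly. A minimal sketch against an inline snippet (the HTML below is invented, only mirroring the downloadarea structure of the real page):

```python
import lxml.html

# Stand-in for r.content; structure mirrors the page's download section
page = b"""
<html><body>
  <div class="downloadarea">
    <a href="https://example.com/review.php?id=abc">DL 1</a>
    <a href="https://example.com/review.php?id=def">DL 2</a>
  </div>
</body></html>"""

dom = lxml.html.fromstring(page)
# @href yields plain strings that are already full URLs
down = dom.xpath('//div[@class="downloadarea"]//a/@href')
for link in down:
    print(link)
```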

Your remaining code has more errors: you used r.url where you needed r2.content, and you forgot the .xpath part of your dom2.xpath(...) query.
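With those fixes applied, the parsing step of the second loop can be isolated and checked against a small inline snippet. The XPath is taken from your query; the sample HTML and the extract_onclicks helper name are made up for illustration:

```python
import lxml.html

def extract_onclicks(html):
    """Return onclick attribute values inside div.button.green elements."""
    dom2 = lxml.html.fromstring(html)
    # Note the .xpath call, and @onclick to select the attribute value
    return dom2.xpath('//div[@class="button green"]//@onclick')

sample = '<div class="button green"><a onclick="go()">GO</a></div>'
print(extract_onclicks(sample))  # ['go()']
```

In the real script you would feed it each response body, e.g. extract_onclicks(requests.get(link2).content).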
