
AttributeError: 'Selector' object has no attribute 'find' (Scrapy)

The scrapy error I get is:

  File "/anaconda/lib/python2.7/site-packages/scrapy/http/response/text.py", line 82, in urljoin
    return urljoin(get_base_url(self), url)
  File "/anaconda/lib/python2.7/urlparse.py", line 261, in urljoin
    urlparse(url, bscheme, allow_fragments)
  File "/anaconda/lib/python2.7/urlparse.py", line 143, in urlparse
    tuple = urlsplit(url, scheme, allow_fragments)
  File "/anaconda/lib/python2.7/urlparse.py", line 182, in urlsplit
    i = url.find(':')
AttributeError: 'Selector' object has no attribute 'find'

Scrapy traced the call back to this line in my spider:

for url in links:
    link_url = response.urljoin(url)

This line is in a generic parse() method. I have run this exact syntax many times before without ever hitting an error, and wading through the documentation and source code for urllib did not turn up anything.

Any advice would be greatly appreciated!

Factors that trigger your error

  • You are using a Python 2.7 environment
  • You passed a scrapy Selector object to urljoin instead of a string

How to reproduce the error

  • Activate an Anaconda Python 2.7 environment

    • Open a scrapy shell with target url www.bing.com

       scrapy shell www.bing.com 
    • Import Selector from scrapy.selector using:

       from scrapy.selector import Selector 
    • Create a Selector object from your response

       selector_obj = Selector(response=response) 
    • Use response.urljoin to join the Selector object

       response.urljoin(selector_obj) 
    • The same error occurs
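
If you do not have a scrapy shell handy, the root cause can be sketched with the standard library alone. This is only an illustration under assumptions: FakeSelector is a hypothetical stand-in for a scrapy Selector (all that matters is that it is not a string), try_join is a helper made up for this example, and the snippet targets Python 3, where urljoin lives in urllib.parse and rejects non-string arguments with a TypeError rather than the AttributeError seen on Python 2.7.

```python
from urllib.parse import urljoin


class FakeSelector:
    """Hypothetical stand-in for a scrapy Selector; it is not a string."""

    def __init__(self, href):
        self.href = href


def try_join(base, url):
    """Return the joined URL, or a description of the error that was raised."""
    try:
        return urljoin(base, url)
    except (TypeError, AttributeError) as exc:
        return "{}: {}".format(type(exc).__name__, exc)


base = "https://www.bing.com/"

# Joining a plain string works as expected.
ok = try_join(base, "/search?q=scrapy")

# Joining a non-string object fails inside urljoin, just like in the question.
bad = try_join(base, FakeSelector("/search?q=scrapy"))

print(ok)
print(bad)
```

The second call fails for the same reason as the spider: urljoin only knows how to handle strings, so whatever ends up in url has to be extracted to a string first.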

How to fix your error

  • Check the url variable with type() or a similar technique, and make sure you have properly extracted the string you wanted before it reaches this loop

     for url in links:
         link_url = response.urljoin(url)
  • Use Python 3.x instead of Python 2.7. When scrapy runs on Python 3.x, the error message is much clearer and easier to understand. (Here is the same error in a python36 environment:)

[screenshot]
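
As a rough sketch of the first fix, the loop can check each item and pull the string out before joining. The names here are assumptions for illustration only: StubSelector is a hypothetical stand-in whose get() mimics the way a scrapy Selector returns its matched text, join_links is a made-up helper, and the standard library urljoin stands in for response.urljoin so the snippet runs without scrapy. It is written for Python 3.

```python
from urllib.parse import urljoin


class StubSelector:
    """Hypothetical stand-in for a scrapy Selector wrapping one href value."""

    def __init__(self, href):
        self._href = href

    def get(self):
        # Mimics Selector.get(), which returns the matched text as a string.
        return self._href


def join_links(base_url, links):
    """Join every link against base_url, extracting strings where needed."""
    joined = []
    for url in links:
        if not isinstance(url, str):
            # Not a string yet: extract the text before handing it to urljoin.
            url = url.get()
        joined.append(urljoin(base_url, url))
    return joined


result = join_links("https://www.bing.com/", ["/images", StubSelector("/maps")])
print(result)
```

In a real spider it is usually simpler to extract the strings up front, for example with response.css("a::attr(href)").getall() (or .extract() on older scrapy releases), so that links only ever contains strings by the time the loop calls response.urljoin(url).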
