How to fix “IndexError: list index out of range”?

Question

I am scraping a directory with Python 3 and Scrapy. The scraped data is inserted into a MySQL database through pipelines.py.

I very often get the error message "IndexError: list index out of range".

In this case, it happens when I scrape the URL of a link. Sometimes the directory publishes the item's website, sometimes not.

I didn't find a solution on Stack Overflow. I tried converting the result to a string, but that doesn't work.

This is the line of code that raises the error:

items['startup_website'] = response.xpath("//div[@class='listing-detail- section-content-wrapper']//a/@href")[0].get() or ''

Does anyone know how I can fix this error?

Answer 1

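The indexing is unnecessary; you should skip it altogether. When the XPath matches nothing, the result is empty, so [0] raises the IndexError before .get() is ever called.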

.xpath() returns a SelectorList, which has a .get() method of its own.
Calling it with a default value gives you the wanted result:

>>> fetch('http://example.com')
2019-08-14 14:28:03 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://example.com> (referer: None)
>>> response.xpath('//a/@href').get('')
'http://www.iana.org/domains/example'
>>> response.xpath('//fake/a/@href').get('')
''
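
To illustrate why the indexed version fails while .get('') does not, here is a minimal sketch that runs without Scrapy. StandInSelectorList is a hypothetical stand-in written for this example, not Scrapy's actual class: it holds plain strings rather than selectors and only mimics the "first match or default" behaviour shown in the shell session above.

```python
class StandInSelectorList(list):
    """Hypothetical stand-in for Scrapy's SelectorList (illustration only)."""

    def get(self, default=None):
        # Return the first match if there is one, otherwise the default.
        return self[0] if self else default


# One result list with a match, one without (the "no website published" case).
matches = StandInSelectorList(["http://www.iana.org/domains/example"])
no_matches = StandInSelectorList()

# Indexing an empty result is what raises the error from the question.
try:
    no_matches[0]
    index_error_raised = False
except IndexError:
    index_error_raised = True

# Calling .get('') on the list itself never indexes, so it cannot fail this way.
website = matches.get("")
missing_website = no_matches.get("")
```

In the spider, the same idea means calling .get('') directly on the result of response.xpath(...) instead of indexing it first.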

Answer 2

The [0] is excessive here. Use response.xpath("//selector").get() or '' instead.

solution1
3 2019-08-14 12:28:40

solution2
1 2019-08-14 12:27:19
