I'm making a spider to scrape a list of games from Nintendo. I've checked the request format with Postman and with the Python requests library, and in both cases I get the results I need. However, when I send the same request with Scrapy, I get 400 errors.
Here's the spider:
import json

import scrapy


class NintendoSpider(scrapy.Spider):
    name = "nintendo"

    def start_requests(self):
        url = 'https://u3b6gr4ua3-dsn.algolia.net/1/indexes/*/queries'
        headers = {}
        headers['x-algolia-api-key'] = 'a29c6927638bfd8cee23993e51e721c9'
        headers['x-algolia-application-id'] = 'U3B6GR4UA3'
        formdata = {
            "requests": [
                {
                    "indexName": "store_game_en_us",
                    "params": '&hitsPerPage=40&maxValuesPerFacet=20&page=0'
                }
            ]
        }
        # Send the query as a JSON-encoded POST body.
        yield scrapy.Request(url, method='POST', headers=headers, body=json.dumps(formdata), callback=self.parse)

    def parse(self, response):
        print(response)
I tried your code and it works fine, which suggests you were rate limited or temporarily banned.
Try again and inspect response.json() in the parse method. If it works, it was a temporary ban.
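To make that check concrete, here is a minimal sketch of a helper you could call from parse. The name summarize_response is hypothetical (it is not part of Scrapy), and it only assumes you pass it the status code and the body text of the response. It decodes the body with the standard json module rather than response.json(), so it also copes with an error page that is not JSON.

```python
import json


def summarize_response(status, body_text):
    """Hypothetical helper: describe a response by status code and decoded JSON body."""
    try:
        payload = json.loads(body_text)
    except ValueError:
        # The body is not valid JSON (for example an HTML error page).
        payload = None
    return {
        "status": status,
        "ok": 200 <= status < 300,
        "payload": payload,
    }


# Inside the spider, assuming the callback receives a text response:
#     def parse(self, response):
#         self.logger.info(summarize_response(response.status, response.text))
```

Note that Scrapy normally filters out non-2xx responses before they reach the callback, so to look at the body of a 400 in parse you would likely need to set handle_httpstatus_list = [400] on the spider.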
From now on, you need to slow down the scraping or use proxies.
Use the DOWNLOAD_DELAY and AUTOTHROTTLE_ENABLED settings. See the Scrapy documentation on this topic.
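As a rough sketch, the throttling settings could look like this in the project's settings.py. The setting names are real Scrapy settings, but every value below is an assumption chosen for illustration, not something tuned for this API.

```python
# settings.py (fragment); all values are illustrative assumptions

# Wait about 2 seconds between requests to the same site.
DOWNLOAD_DELAY = 2

# Let the AutoThrottle extension adapt the delay to observed latency.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 2   # initial delay in seconds
AUTOTHROTTLE_MAX_DELAY = 30    # upper bound on the delay when the server is slow

# Keep concurrency against a single domain low.
CONCURRENT_REQUESTS_PER_DOMAIN = 1
```

The same keys can also go in a custom_settings dict on the spider class if you only want them to apply to this one spider.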