My simple CrawlSpider is below. How can I add an X-Forwarded-For header to this crawler? The header should be sent for every page that is crawled.
from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor


class MySpider(CrawlSpider):
    name = 'spidy'
    allowed_domains = ['website.com', 'www.website.com']
    start_urls = ['http://www.website.com/']

    rules = (
        Rule(LinkExtractor(allow=('/uk/',)), callback='parse_item', follow=True),
    )

    def parse_item(self, response):
        print(response.url)
PS: I found a way to do it via settings.py, but is there a way to do it from the spider itself? Thank you!
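For reference, the settings.py approach the question alludes to is typically done with Scrapy's DEFAULT_REQUEST_HEADERS setting, which applies the header to every request the project makes; the header value below is a placeholder:

```python
# settings.py -- sketch of the project-wide approach.
# 'the_header_value' is a placeholder; use the address you want to forward.
DEFAULT_REQUEST_HEADERS = {
    'X-Forwarded-For': 'the_header_value',
}
```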
You can achieve this by using the process_request argument of the Rule object, as below:
rules = (
    Rule(LinkExtractor(allow=('/uk/',)), callback='parse_item', follow=True,
         process_request='add_header'),
)

def add_header(self, request, response):
    # Scrapy 2.0+ calls process_request with (request, response);
    # older versions pass only the request.
    request.headers['X-Forwarded-For'] = 'the_header_value'
    return request
See the docs for further information.
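To see why this reaches every crawled page, here is a stdlib-only sketch of what CrawlSpider does with process_request; every request a Rule's link extractor produces is passed through the callable before being scheduled. FakeRequest and follow_links are illustrative stand-ins, not Scrapy's API:

```python
# Minimal simulation of a Rule applying process_request to each
# extracted link. Names here are illustrative, not Scrapy's API.

class FakeRequest:
    def __init__(self, url):
        self.url = url
        self.headers = {}

def add_header(request, response=None):
    # Same logic as the spider method in the answer above.
    request.headers['X-Forwarded-For'] = '127.0.0.1'
    return request

def follow_links(urls, process_request):
    # Mimics a Rule: each extracted link becomes a request, then
    # process_request may modify it (or drop it by returning None).
    requests = []
    for url in urls:
        req = process_request(FakeRequest(url))
        if req is not None:
            requests.append(req)
    return requests

reqs = follow_links(['http://www.website.com/uk/a',
                     'http://www.website.com/uk/b'], add_header)
for r in reqs:
    print(r.url, r.headers['X-Forwarded-For'])
```

Because the hook runs on every request the rule generates, the header is attached to all followed pages without touching settings.py.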