
Scrapy: set cookies for a response (not from the request)

I need to extract some wages in USD, but I am accessing the page from another country, so the currency shown is the local one (riel) instead of USD. To get around this, I am sending cookies with the request to ask for a different currency and country.

In the settings I have:

COOKIES_ENABLED = False
COOKIES_DEBUG = True

In the spider I use:

import scrapy
from urllib.parse import urljoin
from scrapy.utils.response import open_in_browser

class HtSpider(scrapy.Spider):
    name = 'sells'
    allowed_domains = ['hattrick.org']

    def start_requests(self):
        urls = ['https://www.hattrick.org']
        for url in urls:
            player = 'goto.ashx?path=/Club/Players/Player.aspx?playerId=450940600'
            joint = urljoin(url, player)
            yield scrapy.Request(
                url=joint,
                cookies={'currency': 'USD', 'country': 'US'},
                # meta={'dont_merge_cookies': True},
                dont_filter=True,
                callback=self.price,
            )

    def price(self, response):
        price_xpath = response.xpath('//*[@id="transferHistory"]/table//tr[1]/td[6]/text()').extract_first()
        print(price_xpath)  # it is not in USD but in Riel :(
        open_in_browser(response)  # to check if it is in Riel or in USD

Then, from the cookies debug output I get:

DEBUG: Sending cookies to: <GET https://www.hattrick.org/en/Club/Players/Player.aspx?playerId=450940600> 
Cookie: currency=USD; country=US; currency=USD; country=US; ASP.NET_SessionId=xxxxx
2021-01-05 16:33:13 [scrapy.downloadermiddlewares.cookies] DEBUG: Received cookies from: <200 https://www.hattrick.org/en/Club/Players/Player.aspx?playerId=450940600>
Set-Cookie: InitialOrigin=Origin=direct|&DateSet=2021-01-05 10:33:13;

**Print price: 2 280 000 Riel**

How can I make the response reflect the cookies I send in the request instead of the ones set by the website? In short: how do I get USD and not riel?

First, have you tested with Postman to make sure that the site actually responds to these cookies?
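If Postman is not at hand, the same check can be sketched with Python's standard library. This is only a minimal sketch: it assumes the site reads the `currency` and `country` cookies named in the question, the helper names `build_probe_request` and `fetch_body` are hypothetical, and the page may require a logged-in session that this probe does not have.

```python
import urllib.request

# URL taken from the debug log in the question.
PLAYER_URL = "https://www.hattrick.org/en/Club/Players/Player.aspx?playerId=450940600"

def build_probe_request(url, cookie_header):
    """Build a GET request that carries the given raw Cookie header."""
    return urllib.request.Request(url, headers={"Cookie": cookie_header})

def fetch_body(request, timeout=10):
    """Send the request and return the decoded body (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")

# Building the request does not touch the network, so it is safe to inspect.
probe = build_probe_request(PLAYER_URL, "currency=USD; country=US")
print(probe.get_header("Cookie"))
```

Calling `fetch_body(probe)` and searching the returned HTML for the currency name would show whether the server honours the cookies at all, independently of Scrapy.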

If you have COOKIES_ENABLED = False, then Scrapy will not send your cookies to the target server. Since you are only sending one request to the server, the cookies coming back from the server will not come into play. So setting COOKIES_ENABLED = True should solve it.

However, if you need to send multiple requests to the server, this might not work, since the Set-Cookie headers from the server might override your cookies.

To solve this, I would set COOKIES_ENABLED = False and then send the request like this:

yield scrapy.Request(
    url=joint,
    headers={
        # Raw Cookie header: name=value pairs separated by "; "
        'Cookie': 'currency=USD; country=US'
    },
    dont_filter=True,
    callback=self.price)

I'm using headers instead of cookies because, if you have disabled cookies in the settings, the cookies argument is not considered.
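If the cookies are easier to keep as a dict, the header value can be built from it. The snippet below is a small sketch with a hypothetical helper name, `cookie_header`; it assumes the values need no quoting or escaping, which holds for plain tokens such as `USD` and `US`.

```python
def cookie_header(cookies):
    """Serialize a dict of cookie names and values into a Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())

header_value = cookie_header({"currency": "USD", "country": "US"})
print(header_value)  # currency=USD; country=US

# Hypothetical use inside the spider from the question:
# yield scrapy.Request(url=joint, headers={"Cookie": header_value},
#                      dont_filter=True, callback=self.price)
```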
