
Scrapy: getting values from multiple sites

I'm trying to pass a value from one function to another.

I looked up the docs and just didn't understand them. Reference:

def parse_page1(self, response):
    item = MyItem()
    item['main_url'] = response.url
    request = scrapy.Request("http://www.example.com/some_page.html",
                             callback=self.parse_page2)
    request.meta['item'] = item
    yield request

def parse_page2(self, response):
    item = response.meta['item']
    item['other_url'] = response.url
    yield item

Here is pseudocode of what I want to achieve:

import scrapy

class GotoSpider(scrapy.Spider):
    name = 'goto'
    allowed_domains = ['first.com', 'second.com']
    start_urls = ['http://first.com/']

    def parse(self, response):
        name = response.xpath(...)
        # intent: fetch the price from second.com and use it here
        price = scrapy.Request(second.com, callback=self.parse_check)
        yield (name, price)

    def parse_check(self, response):
        price = response.xpath(...)
        return price

This is how you can pass any value, link, etc. to other methods:

import scrapy

class GotoSpider(scrapy.Spider):
    name = 'goto'
    allowed_domains = ['first.com', 'second.com']
    start_urls = ['http://first.com/']

    def parse(self, response):
        # .get() returns the first match as a string (or None if nothing matched)
        name = response.xpath(...).get()
        link = response.xpath(...).get()  # link to second.com where you may find the price
        request = scrapy.Request(url=link, callback=self.parse_check)
        request.meta['name'] = name  # carry the name along with the request
        yield request

    def parse_check(self, response):
        name = response.meta['name']  # the value stored in parse()
        price = response.xpath(...).get()
        # assumes the fields in your "items.py" are declared as name and price
        yield {"name": name, "price": price}
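
To see why the value has to travel on the request instead of being returned from the callback, here is a rough, Scrapy-free sketch of the request/callback flow. The Request and Response classes, the crawl() loop, and the FAKE_PAGES contents are all made up for illustration; this is a toy model, not how Scrapy is actually implemented. The point it tries to show is that a callback only hands requests back to the engine, and the engine calls the next callback later with a response that carries the same meta dict.

```python
from collections import deque


class Request:
    """Toy stand-in for scrapy.Request: a URL, a callback and a meta dict."""

    def __init__(self, url, callback):
        self.url = url
        self.callback = callback
        self.meta = {}


class Response:
    """Toy stand-in for a downloaded page; it exposes the request's meta."""

    def __init__(self, request, body):
        self.url = request.url
        self.body = body
        self.meta = request.meta


# Hypothetical page contents, standing in for real downloads.
FAKE_PAGES = {
    "http://first.com/": "Widget",
    "http://second.com/widget": "9.99",
}


def parse(response):
    name = response.body
    request = Request("http://second.com/widget", callback=parse_check)
    request.meta["name"] = name  # stash the value on the request
    yield request                # hand the request back to the engine


def parse_check(response):
    # The name arrives through meta; the price comes from this page.
    yield {"name": response.meta["name"], "price": response.body}


def crawl(start_url, callback):
    """Run callbacks until no requests remain; collect everything else as items."""
    queue = deque([Request(start_url, callback)])
    items = []
    while queue:
        request = queue.popleft()
        response = Response(request, FAKE_PAGES[request.url])
        for result in request.callback(response):
            if isinstance(result, Request):
                queue.append(result)   # schedule the follow-up request
            else:
                items.append(result)   # anything else is treated as an item
    return items


items = crawl("http://first.com/", parse)
print(items)  # [{'name': 'Widget', 'price': '9.99'}]
```

In the pseudocode from the question, price would just be the Request object itself, because parse_check has not run yet at that point. Newer Scrapy versions also accept a cb_kwargs argument on Request for the same purpose, which passes the values to the callback as keyword arguments.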


 