python scrapy start_urls

Is it possible to do something like the following, but with multiple URLs? Each link has about 50 pages to crawl and loop over. My current solution works, but only when I use a single URL rather than several.

start_urls = [
    'https://www.xxxxxxx.com.au/home-garden/page-%s/c18397' % page for page in range(1, 50),
    'https://www.xxxxxxx.com.au/automotive/page-%s/c21159' % page for page in range(1, 50),
    'https://www.xxxxxxx.com.au/garden/page-%s/c25449' % page for page in range(1, 50),
]

You can do this by building a second list. The code is below; I hope this is what you're looking for.

final_urls = []
start_urls = [
    'https://www.xxxxxxx.com.au/home-garden/page-%s/c18397',
    'https://www.xxxxxxx.com.au/automotive/page-%s/c21159',
    'https://www.xxxxxxx.com.au/garden/page-%s/c25449',
]
final_urls.extend(url % page for page in range(1, 50) for url in start_urls)
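
One thing to watch with the snippet above: if it sits directly in a spider's class body, the generator expression will likely raise a NameError, because names assigned in a class body are not visible to the inner `for ... in start_urls` clause. A minimal sketch that avoids this by building the list in a module-level helper is shown here. The names `URL_TEMPLATES` and `build_start_urls` are hypothetical and not part of Scrapy, and the page range 1 to 49 simply mirrors `range(1, 50)` from the answer.

```python
# Hypothetical templates; the domain is the placeholder used in the question.
URL_TEMPLATES = [
    'https://www.xxxxxxx.com.au/home-garden/page-%s/c18397',
    'https://www.xxxxxxx.com.au/automotive/page-%s/c21159',
    'https://www.xxxxxxx.com.au/garden/page-%s/c25449',
]


def build_start_urls(templates, first_page=1, last_page=49):
    """Return one URL per (template, page) pair; both page bounds are inclusive."""
    return [
        template % page
        for template in templates
        for page in range(first_page, last_page + 1)
    ]


# Plain list of strings, suitable for assigning to a spider's start_urls.
start_urls = build_start_urls(URL_TEMPLATES)
```

Inside the spider class, `start_urls = build_start_urls(URL_TEMPLATES)` should then work as an ordinary class attribute, since the comprehension runs inside the function rather than in the class body.
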

Regarding your follow-up question, have you tried this?

def parse(self, response):
    for link in final_urls:
        yield scrapy.Request(link)

I recommend using start_requests for this:

def start_requests(self):
    base_urls = [
        'https://www.xxxxxxx.com.au/home-garden/page-{page_number}/c18397',
        'https://www.xxxxxxx.com.au/automotive/page-{page_number}/c21159',
        'https://www.xxxxxxx.com.au/garden/page-{page_number}/c25449',
    ]

    for page in range(1, 50):
        for base_url in base_urls:
            url = base_url.format(page_number=page)
            yield scrapy.Request(url, callback=self.parse)
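
The URL generation in this answer can also be pulled out into a plain generator, which makes it easy to check without running a crawl. The sketch below is only an illustration: `BASE_URLS` and `iter_page_urls` are hypothetical names, and the choice of `range(1, 51)` is an assumption that each category really has 50 pages (`range(1, 50)` in the answer stops at page 49).

```python
from itertools import product

# Hypothetical templates; the domain is the placeholder used in the question.
BASE_URLS = [
    'https://www.xxxxxxx.com.au/home-garden/page-{page_number}/c18397',
    'https://www.xxxxxxx.com.au/automotive/page-{page_number}/c21159',
    'https://www.xxxxxxx.com.au/garden/page-{page_number}/c25449',
]


def iter_page_urls(base_urls, pages):
    """Yield URLs page by page, visiting every category for each page number."""
    # product(pages, base_urls) varies base_urls fastest, matching the
    # nested loops in the answer (outer loop: page, inner loop: base URL).
    for page, base_url in product(pages, base_urls):
        yield base_url.format(page_number=page)


# Assumption: 50 pages per category, so the range must end at 51.
urls = list(iter_page_urls(BASE_URLS, range(1, 51)))
```

In a spider, `start_requests` could then loop over `iter_page_urls(BASE_URLS, range(1, 51))` and yield `scrapy.Request(url, callback=self.parse)` for each URL.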
