Python Scrapy print start_url or variable in start_url
I'm trying to yield the "number", or maybe get the start_url and then parse it to get the number:
class EbaypriceSpider(Spider):
    name = "ebayprice"
    allowed_domains = ["www.ebay.com"]
    start_urls = []

    with open('Numbers.csv', 'rb') as omcan_numbers:
        number_list = csv.reader(omcan_numbers)
        for number in number_list:
            start_urls.append('http://www.ebay.com/sch/Omcan' + str(number))

    def parse(self, response):
        # DO stuff then call parse_page2

    def parse_page2(self, response):
        print number
        # I want to get the start URL or the number here
Instead of start_urls, use the start_requests method:
import csv

from scrapy import Request, Spider


class EbaypriceSpider(Spider):
    name = "ebayprice"
    allowed_domains = ["www.ebay.com"]

    def start_requests(self):
        # Python 3: open the CSV in text mode (on Python 2, use mode 'rb' instead)
        with open('Numbers.csv', newline='') as omcan_numbers:
            for row in csv.reader(omcan_numbers):
                # csv.reader yields each row as a list; this assumes the number is in the first column
                number = row[0]
                url = 'http://www.ebay.com/sch/Omcan' + number
                yield Request(url, meta={'start_url': url}, callback=self.parse)

    def parse(self, response):
        # DO stuff then call parse_page2
        ...
        # keep passing the `meta` argument from the previous request
        yield Request(some_other_url, meta=response.meta, callback=self.parse_page2)

    def parse_page2(self, response):
        # the start URL set in start_requests travels along in response.meta
        start_url = response.meta['start_url']
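If you also want the number itself and not just the URL, one option is to strip the known prefix off the start URL. The following is only a sketch: `BASE_URL`, `build_url`, and `number_from_url` are names made up for illustration, the sample numbers are invented, and it assumes every start URL is the prefix followed directly by the number, as in the code above.

```python
import csv
import io

# Assumed prefix, copied from the spider above.
BASE_URL = 'http://www.ebay.com/sch/Omcan'


def build_url(number):
    """Build a start URL the same way the spider does (hypothetical helper)."""
    return BASE_URL + str(number)


def number_from_url(url):
    """Recover the number from a start URL by stripping the known prefix."""
    if not url.startswith(BASE_URL):
        raise ValueError('not a start URL: %r' % url)
    return url[len(BASE_URL):]


# csv.reader yields each row as a list of strings, so take the first column.
# The in-memory file stands in for Numbers.csv; its contents are made up.
sample_csv = io.StringIO('1001\n1002\n')
numbers = [row[0] for row in csv.reader(sample_csv) if row]
urls = [build_url(n) for n in numbers]
recovered = [number_from_url(u) for u in urls]
```

Inside parse_page2 you could then call `number_from_url(response.meta['start_url'])`. Putting the number into `meta` next to `start_url` would work just as well and avoids parsing the URL at all.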