How to add start_url as an item?
I am new to Python and Scrapy. I want item['Source_Website'] to be the URL I am crawling. How can I achieve this?
I tried item['Source_Website'] = selector.ulr and item['Source_Website'] = start_urls, but neither worked.
from scrapy.selector import Selector
from scrapy.spider import BaseSpider
from shikari.items import ShikariItem


class Radiate(BaseSpider):
    name = "sss"
    download_delay = 3
    concurrent_requests = 1
    allowed_domains = ["website.com"]
    start_urls = ['http://www.website.com/1',
                  'http://www.website.com/2']

    def parse(self, response):
        sel = Selector(response)
        item = ShikariItem()
        item['Heading'] = str(sel.xpath('//h1/text()').extract())
        item['Source_Website'] =
        return item
Use response.url as follows:
item['Source_Website'] = response.url
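To illustrate why response.url works where start_urls does not, here is a minimal sketch that runs without Scrapy installed. It assumes a hypothetical FakeResponse class that models only the .url attribute of a Scrapy response, and it uses a plain dict in place of ShikariItem; neither stand-in is part of the original code. The point is that parse() is called once per response, so response.url is the single URL of that page, whereas start_urls is the whole list.

```python
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in for a Scrapy response; only .url is modelled."""
    url: str


# Same start URLs as in the question.
START_URLS = ['http://www.website.com/1',
              'http://www.website.com/2']


def parse(response):
    """Mirrors the spider's parse(): builds one item per response."""
    item = {}  # assumption: a plain dict stands in for ShikariItem
    # response.url is the URL of the page this particular response came from.
    item['Source_Website'] = response.url
    return item


# Simulate the crawl: parse() is invoked once for each fetched page.
items = [parse(FakeResponse(url)) for url in START_URLS]

for item in items:
    print(item['Source_Website'])
```

In the real spider the only change needed is the single assignment line shown in the answer; the rest of parse() stays as it is.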