
Scrapy: extract the URL of an image

How can I get image URLs from a website using Scrapy in Python? Please help. Here is my code:

from scrapy.spiders import CrawlSpider, Rule
# scrapy.contrib.linkextractors is the old, deprecated location of LinkExtractor
from scrapy.linkextractors import LinkExtractor
from scrapy.item import Item, Field

class MyItem(Item):
    url= Field()


class someSpider(CrawlSpider):
    name = 'crawltest'
    allowed_domains = ['bambeeq.com']
    start_urls = ['http://www.bambeeq.com/']
    rules = (Rule(LinkExtractor(allow=()), callback='parse_obj', follow=True),)

    def parse_obj(self,response):
        item = MyItem()
        item['url'] = []
        for link in LinkExtractor(allow=(),deny = self.allowed_domains).extract_links(response):
            item['url'].append(link.url)
            #item['image'].append(link.img)
        return item

You are extracting links ("a" elements), not images ("img" elements). Try this:

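    # replaces parse_obj in someSpider
    def parse_obj(self, response):
        item = MyItem()
        item['url'] = []
        # iterate over the src attribute of every <img> element on the page
        for image in response.xpath('//img/@src').extract():
            # src is often relative, so resolve it against the page URL
            item['url'].append(response.urljoin(image))
        yield item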
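
To see why the urljoin step matters, here is a minimal sketch of the same idea using only the Python standard library, so it runs without a crawl. It is not Scrapy's implementation: ImgSrcParser, image_urls, and the example.com page are made-up names for illustration, and urllib.parse.urljoin stands in for response.urljoin, which behaves roughly the same way for a page without a base tag.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcParser(HTMLParser):
    """Collect the src attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # <a> tags are ignored here; only <img> carries the image location
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_urls(page_url, html):
    """Return absolute image URLs found in html, resolved against page_url."""
    parser = ImgSrcParser()
    parser.feed(html)
    return [urljoin(page_url, src) for src in parser.sources]


# Made-up page URL and markup, for illustration only.
sample_html = (
    '<a href="/about">About</a>'
    '<img src="/img/logo.png">'
    '<img src="banner.jpg">'
)
urls = image_urls("http://www.example.com/shop/index.html", sample_html)
print(urls)
```

The link in the "a" element is skipped, and both relative src values come back as full URLs, which is what the item needs if the images are to be downloaded later.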

