Scrapy: splitting items with response.xpath

Question

I am trying to scrape a web page that has multiple blog entries on its first page.
This is my code so far:

for rel in response.xpath('//*[@id="content"]/div[*]/div/comment()[2]'):
    item = Example()
    item['title'] = rel.xpath('//*[@id="content"]/div[*]/div/div/input/@value').extract()
    item['link'] = rel.xpath('//*[@id="content"]/div[*]/div/div/span[4]/a/@href').extract()
    yield item

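The problem is that when I use "*", I get one link and one title, each containing all of the entries.
What I want is a title and a link for each entry.
I am new to Python and Scrapy and don't know how to increment the index to get the individual entries.
The first entry is at index "2", and each following entry is 3 further on, ending at 29 (2, 5, 8, ..., 29).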
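
The index pattern described here can be written down directly. The following is only a sketch of that arithmetic: `positions` and `entry_xpaths` are hypothetical names, and the XPath prefix is copied from the code in the question, not verified against the real page.

```python
# Entry positions as described in the question: start at 2, step by 3, end at 29.
positions = list(range(2, 30, 3))

# Hypothetical per-entry XPath prefixes built from those positions.
entry_xpaths = ['//*[@id="content"]/div[{}]/div'.format(n) for n in positions]

for xp in entry_xpaths:
    print(xp)
```

Each prefix could be passed to `response.xpath` in turn, but hard-coded positions like these stop matching as soon as the page layout shifts.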

Answer 1

Let me suggest more specific XPath expressions. Something like this should get you closer to your goal:

for rel in response.xpath('//div[@class="beschreibung"]'):
    item = Example()  # create a fresh item for each entry
    # The leading "." keeps each query relative to the current entry
    item['title'] = rel.xpath('.//strong[contains(text(),"Release")]/following-sibling::*[1]/@value').extract()
    item['link'] = rel.xpath('.//span[@style="display:inline;"]//a[contains(text(),"Share")]/@href').extract()
    yield item

Answer 1
0 · Accepted · 2016-03-02 14:38:21
