I am trying to scrape a web page that has multiple blog entries on its first page.
This is my code so far:
    for rel in response.xpath('//*[@id="content"]/div[*]/div/comment()[2]'):
        item = Example()
        item['title'] = rel.xpath('//*[@id="content"]/div[*]/div/div/input/@value').extract()
        item['link'] = rel.xpath('//*[@id="content"]/div[*]/div/div/span[4]/a/@href').extract()
        yield item
The problem is that if I use the "*" wildcard, I get back a single title and a single link, each containing all entries.
What I would like is one title and one link for every single entry.
I am very new to Python and Scrapy and don't know how to count up the index to get the individual entries back.
The first entry is at index 2 and each following entry is 3 further, ending at 29 (2, 5, 8, ..., 29).
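To take the counting part literally: the indices 2, 5, 8, ..., 29 form a sequence with step 3, which Python's built-in range can produce. The snippet below is only a sketch; entry_xpaths is a hypothetical helper name, and it simply substitutes each index into the XPaths from the question.

```python
# Positions of the entry <div>s as described in the question: 2, 5, 8, ..., 29.
# range() excludes its stop value, so 30 is used to include 29.
ENTRY_POSITIONS = list(range(2, 30, 3))


def entry_xpaths(position):
    """Build the title and link XPaths for one entry (hypothetical helper)."""
    base = '//*[@id="content"]/div[{}]/div/div'.format(position)
    return {
        'title': base + '/input/@value',
        'link': base + '/span[4]/a/@href',
    }


for position in ENTRY_POSITIONS:
    print(position, entry_xpaths(position))
```

Inside the spider, each of these strings could be passed to response.xpath(...).extract(). Position-based XPaths like these tend to break as soon as the page layout shifts, so treat this as a stopgap.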
Let me suggest more explicit XPaths. Something like this should be closer to your goal:
    for rel in response.xpath('//div[@class="beschreibung"]'):
        # Create a fresh item per entry; the leading "." keeps each XPath relative to rel.
        item = Example()
        item['title'] = rel.xpath('.//strong[contains(text(),"Release")]/following-sibling::*[1]/@value').extract()
        item['link'] = rel.xpath('.//span[@style="display:inline;"]//a[contains(text(),"Share")]/@href').extract()
        yield item
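The reason the loop in the question returns every entry at once is that an XPath starting with // searches from the document root, even when it is called on a nested selector such as rel. Prefixing the expression with a dot (.//) restricts the search to the current entry. The sketch below illustrates the per-entry lookup using the standard library's xml.etree.ElementTree as a stand-in for Scrapy selectors, so it runs without Scrapy installed. The markup, the URLs and the parse_entries name are made up for illustration and are not taken from the real page.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed markup standing in for the real page (an assumption).
PAGE = """
<div id="content">
  <div class="beschreibung">
    <input value="First release"/>
    <a href="http://example.com/1">Share</a>
  </div>
  <div class="beschreibung">
    <input value="Second release"/>
    <a href="http://example.com/2">Share</a>
  </div>
</div>
"""


def parse_entries(markup):
    """Return one dict per entry, each with its own title and link."""
    root = ET.fromstring(markup)
    items = []
    for entry in root.findall(".//div[@class='beschreibung']"):
        # './/' searches only below the current entry,
        # so each item receives its own values.
        items.append({
            'title': entry.find('.//input').get('value'),
            'link': entry.find('.//a').get('href'),
        })
    return items


items = parse_entries(PAGE)
for item in items:
    print(item)
```

In a Scrapy spider the same idea applies: loop over one selector per entry, then call rel.xpath('.//...') with the leading dot inside the loop.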