XPath or CSS selector for an attribute value in Scrapy

Question

Hi there, I am new to Scrapy and I want to extract an attribute value from an HTML element. What is the right way to do that? I want to extract the "data-next-url" attribute from this element:

<div class="loading_more_jobs" data-type="loading_more_jobs" style="display:none;" data-next-url="https://www.ziprecruiter.com/candidate/search?search=restaurant&amp;page=2&amp;location=Atlanta%2C+Georgia"></div>

I am using this XPath, but it is not working:

 response.xpath('//*[@class="loading_more_jobs"]/@data-next-url').extract()

Answer 1

If you check the page's source HTML (what Scrapy actually downloads), you'll find that the "loading_more_jobs" div has no "data-next-url" attribute:

  <button class="load_more_jobs" data-type="load_more_jobs" data-next-url="">Load More Job Results</button>
  <div class="loading_more_jobs" data-type="loading_more_jobs" style="display:none;"></div>

But you can still get the next-page URL from another element:

<div class="job_results" data-this-url="/candidate/search?search=restaurant&amp;location=Atlanta%2C+Georgia" data-next-url="/candidate/search?location=Atlanta%2C+Georgia&amp;page=2&amp;search=restaurant" data-type="job_results">

=>

response.xpath('//div[@class="job_results"]/@data-next-url').extract_first()

or

<link rel="next" href="https://www.ziprecruiter.com/candidate/search?location=Atlanta%2C+Georgia&amp;page=2&amp;search=restaurant">

=>

response.xpath('//link[@rel="next"]/@href').extract_first()
