Scrapy: Next Page and Previous Page buttons share the same class, can't reach the next page

I am trying to scrape a dictionary site that has next-page and previous-page buttons. When I try to reach the next page ("Sonraki Sayfa") this way:

next_page = response.css('div.col-md-6.col-sm-6.col-xs-6 a::attr(href)').get()

I always get the previous-page link instead, because both buttons have the same class names and .get() returns only the first match.

This is the HTML of the website:

<ul class="sayfalama">
  <div class="col-md-12 col-xs-12 col-sm-12">
    <div class="row">
      <div class="col-md-6 col-sm-6 col-xs-6">
        <a href="kelimeler.php?s=-1" style="background: white; font-weight: bold; padding:5px;">Önceki Sayfa</a>
      </div>
      <div class="col-md-6 col-sm-6 col-xs-6">
        <a href="kelimeler.php?s=1" style="background: white; font-weight: bold; padding:5px;">Sonraki Sayfa</a>
      </div>
    </div>
</ul>

This is my spider code:

next_page = response.css('div.col-md-6.col-sm-6.col-xs-6 a::attr(href)').get()
print(next_page)

if next_page is not None:
    yield response.follow(next_page, callback=self.parse)

What should I change to reach the next page ("Sonraki Sayfa") instead of the previous page ("Önceki Sayfa")?

You can try :nth-child(2), which matches only the second div inside its parent (the one holding the next-page link), like below:

next_page = response.css('div.col-md-6.col-sm-6.col-xs-6:nth-child(2) a::attr(href)').get()

div.col-md-6.col-sm-6.col-xs-6:nth-child(2) a { color:red; }

<div class="col-md-6 col-sm-6 col-xs-6">
  <a href="kelimeler.php?s=-1" style="background: white; font-weight: bold; padding:5px;">Önceki Sayfa</a>
</div>
<div class="col-md-6 col-sm-6 col-xs-6">
  <a href="kelimeler.php?s=1">Sonraki Sayfa</a>
</div>
