How to go to the next page until the last page in Python Selenium when scraping a website?
The image shows the CSS selector and XPath of the pagination control.
I also want to use a regex to split the product title into parts such as Apple, iPhone 12, and Neo Galactic Silver, and print each part on a new line.
After finishing the product list on the current page, I want to click next and repeat the same procedure for the products on the next page.
This is the problem: once it has processed the 10 items on the current page, I have no idea how to move to the next page and start all over again.
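For the title-splitting step, a minimal sketch might look like the following. It assumes the title text is a single comma-separated string such as "Apple, iPhone 12, Neo Galactic Silver"; the real titles on the site may be formatted differently, and `split_title` is a hypothetical helper name.

```python
import re


def split_title(title):
    """Split a comma-separated product title into its trimmed parts."""
    # Split on commas, swallowing any whitespace around them, and drop empty pieces.
    return [part for part in re.split(r"\s*,\s*", title.strip()) if part]


# Print each part on its own line.
for part in split_title("Apple, iPhone 12, Neo Galactic Silver"):
    print(part)
```

If the titles turn out to use a different separator, only the pattern passed to `re.split` should need to change.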
import xlwt
from selenium import webdriver
import re
import time

class cometmobiles:
    def __init__(self):
        self.url='https://www.mediaworld.it/catalogo/telefonia/smartphone-e-cellulari/smartphone'

    def comet(self):
        try:
            driver=webdriver.Chrome()
            driver.get(self.url)
            time.sleep(5)
            cookies = driver.find_element_by_id("onetrust-accept-btn-handler")
            cookies.click()
            print("accepted cookies")
            driver.maximize_window()
            print("window maximized")
            mylist = []
            hasNextPate = True
            while hasNextPate:
                containers = []
                containters =driver.find_elements_by_css_selector('article[class="product clearfix p-list-js"]')
                for container in containters:
                    #Title
                    try:
                        title = container.find_element_by_css_selector('h3[class="product-name"]').text
                        print(title)
                    except:
                        pass
                    #price
                    try:
                        price = container.find_element_by_css_selector('span[class="price mw-price enhanced"]').text
                        print(price)
                    except:
                        pass
                try:
                    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
                    time.sleep(5)
                    nxt=driver.find_elements_by_css_selector('span[class="pages"] a')
                    time.sleep(5)
                    nxt.click()
                except:
                    break
        except:
            pass

comets=cometmobiles()
comets.comet()
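One likely reason the loop above never advances: `find_elements_by_css_selector` (plural) returns a list, and a list has no `click()` method, so `nxt.click()` raises `AttributeError`, which the bare `except` turns into a silent `break`. The sketch below reproduces that effect without a browser; `try_click` is a hypothetical helper used only for illustration.

```python
def try_click(target):
    """Call target.click() and report what happened instead of hiding the error."""
    try:
        target.click()
        return "clicked"
    except Exception as exc:
        # Returning the exception name makes the failure visible.
        return type(exc).__name__


# A plain list stands in for the result of find_elements_*.
print(try_click([]))  # AttributeError
```

Catching a specific exception, or at least printing it, would have surfaced this immediately.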
Instead of this part:
try:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(5)
    nxt=driver.find_elements_by_css_selector('span[class="pages"] a')
    time.sleep(5)
    nxt.click()
except:
    break
You can use the code below. Note that if the requested page number doesn't exist, the website redirects to the main page, so you should check for that and break out of the loop:
try:
    x = 0
    while True:
        x += 1
        driver.get(self.url + "?pageNumber=" + str(x))  # Load page x (inside comet(), the base URL is self.url)
        if driver.current_url == self.url:  # No such page: the site redirected to the main page, so stop
            break
        # Scrape the products of the current page here
except Exception:
    pass
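The page loop above could also be wrapped in a small function, sketched below. It relies on the behaviour described in this answer (a non-existent `pageNumber` redirects to the base URL), which is an assumption about the site rather than something guaranteed. `scrape_all_pages`, `scrape_page` and `max_pages` are hypothetical names, and `max_pages` is only a safety cap in case the redirect never happens. The function needs nothing more than an object with `get()` and `current_url`, which a Selenium WebDriver provides.

```python
def scrape_all_pages(driver, base_url, scrape_page, max_pages=200):
    """Visit base_url?pageNumber=1, 2, ... and call scrape_page(driver) on each.

    Stops when the site redirects back to base_url (assumed to mean
    "no such page") or after max_pages pages. Returns the number of pages scraped.
    """
    scraped = 0
    for page in range(1, max_pages + 1):
        driver.get(f"{base_url}?pageNumber={page}")
        # Assumption from the answer: a missing page redirects to the base URL.
        if driver.current_url.rstrip("/") == base_url.rstrip("/"):
            break
        scrape_page(driver)
        scraped += 1
    return scraped
```

With a real browser this might be called as `scrape_all_pages(driver, self.url, scrape_page)`, where `scrape_page` holds the title and price extraction from the question. Note that Selenium 4.3 and later removed the `find_element_by_*` helpers, so that extraction would need `find_element(By.CSS_SELECTOR, ...)` there.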