Scrape multiple web pages with unchanging URL by clicking next
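I am trying to scrape multiple pages from the website https://ethnicelebs.com/all-celebs, but the URL stays the same for every page. I want to scrape the ethnicity information of every celebrity (shown when you click a listed name) across all pages.
After navigating, I use the following code to scrape the ethnicity information, but it keeps scraping the first page: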
url = 'https://ethnicelebs.com/all-celeb'
driver = webdriver.Chrome()
driver.get(url)
while True:
    page = requests.post(url)
    soup = BeautifulSoup(page.text, 'html.parser')
    for href in soup.find_all('a', href=True)[18:]:
        print('Found the URL:{}'.format(href['href']))
        request_href = requests.get(href['href'])
        soup2 = BeautifulSoup(request_href.content)
        for each in soup2.find_all('strong')[:-1]:
            print(each.text)
    Next_button = (By.XPATH, "//*[@title='Go to next page']")
    WebDriverWait(driver, 50).until(EC.element_to_be_clickable(Next_button)).click()
    url = driver.current_url
    time.sleep(5)
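(Thanks, @Sureshmani!)
How can I scrape the current page while continuing to navigate, instead of going back to the first page each time? Thanks!
I was able to navigate to each page with this approach: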
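A likely reason the loop keeps returning the first page: `requests.post(url)` issues a separate HTTP request that knows nothing about the clicks made in the Selenium browser, and because the URL never changes, it fetches the same initial HTML every time. One alternative is to parse `driver.page_source`, which reflects what the browser currently shows. The sketch below illustrates only the parsing half. The names `LinkCollector` and `links_in_html` and the sample HTML are made up for illustration, and the standard-library parser is used only to keep the sketch free of extra packages; BeautifulSoup accepts the same string.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def links_in_html(html):
    """Return all link targets found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.hrefs


# Hypothetical stand-in for driver.page_source after clicking "next".
sample_page_source = (
    '<div><h6><a href="https://example.com/a">A</a></h6><a>no href</a></div>'
)
print(links_in_html(sample_page_source))
```

With a live browser, the call inside the loop would be `links_in_html(driver.page_source)` after each click, so every iteration sees the page that is currently displayed.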
url = 'https://ethnicelebs.com/all-celeb'
driver = webdriver.Chrome()
driver.get(url)
time.sleep(5)
i = 1
while i < 52:
    # Entries for page i sit in a div whose data-id contains 'pt-cv-page-<i>'
    myxpath = "//div[contains(@data-id,'pt-cv-page-" + str(i) + "')]//h6/a"
    print(myxpath)
    # find_elements_by_xpath is no longer available in recent Selenium 4 releases
    CelebLinks = driver.find_elements(By.XPATH, myxpath)
    print('Total Celeb displayed in the current page', len(CelebLinks))
    for links in CelebLinks:
        print(str(links.text).encode('UTF-8', 'ignore'))
        print('Link URl: ', links.get_attribute('href'))
        # Fetch the celebrity's own page and print its <strong> entries
        request_href = requests.get(links.get_attribute('href'))
        soup = BeautifulSoup(request_href.content, 'html.parser')
        for each in soup.find_all('strong')[:-1]:
            print(str(each.text).encode('UTF-8', 'ignore'))
    # Move the browser to the next page before the next iteration
    Next_button = (By.XPATH, "//*[@title='Go to next page']")
    WebDriverWait(driver, 50).until(EC.element_to_be_clickable(Next_button)).click()
    time.sleep(5)
    i += 1
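The loop above hard-codes 52 iterations and builds one XPath per page index. As a rough sketch of the same idea, the XPath construction can be pulled into a small function, and the loop can stop when moving to the next page fails rather than at a fixed count. `page_xpath`, `scrape_pages`, `scrape_page`, `click_next`, and `max_pages` are hypothetical names, not part of the original answer; the two callables are assumed to wrap the Selenium calls shown above.

```python
def page_xpath(page_number):
    """Build the XPath for the celebrity links on a given page.

    Mirrors the expression used in the answer: the container div's
    data-id attribute contains 'pt-cv-page-<n>'.
    """
    return "//div[contains(@data-id,'pt-cv-page-{}')]//h6/a".format(page_number)


def scrape_pages(scrape_page, click_next, max_pages=1000):
    """Visit pages 1, 2, ... until click_next() reports there is no next page.

    scrape_page: callable that receives the XPath for the current page.
    click_next:  callable returning True if the browser moved to a new page.
    max_pages:   safety cap so the loop always terminates.
    Returns the number of pages visited.
    """
    visited = 0
    for page_number in range(1, max_pages + 1):
        scrape_page(page_xpath(page_number))
        visited += 1
        if not click_next():
            break
    return visited


# Demo with stand-ins: pretend the site has three pages.
remaining_clicks = [True, True, False]
seen_xpaths = []
pages = scrape_pages(seen_xpaths.append, lambda: remaining_clicks.pop(0))
print(pages, seen_xpaths[-1])
```

With Selenium, `click_next` could wrap the `WebDriverWait(...).until(...).click()` call in a `try`/`except` for `selenium.common.exceptions.TimeoutException` and return `False` on timeout, so the last page ends the loop instead of raising.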
You will need the following imports:
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
import time
from bs4 import BeautifulSoup
import requests