
How to scrape the ratings and all the reviews from the website using Selenium

I want to scrape the rating and all the reviews on the page, but I can't find the right path.

import urllib.request
from bs4 import BeautifulSoup
import csv
import os
from selenium import webdriver
from selenium.webdriver.support.select import Select
from selenium.webdriver.common.keys import Keys
import pandas as pd
import time

# Path to the local chromedriver binary
chrome_path = r'C:/Users/91940/AppData/Local/Programs/Python/Python39/Scripts/chromedriver.exe'
driver = webdriver.Chrome(executable_path=chrome_path)
driver.implicitly_wait(10)
driver.get("https://www.lazada.sg/products/samsung-galaxy-watch3-bt-45mm-titanium-i1156462257-s4537770883.html?search=1&freeshipping=1")

# Product title
product_name = driver.find_element_by_xpath('//*[@id="module_product_title_1"]/div/div/h1')
print(product_name.text)

# Average rating
rating = driver.find_element_by_xpath("//span[@class='score-average']")
print(rating.text)

# Review section
review = driver.find_element_by_xpath('//*[@id="module_product_review"]/div/div/div[3]/div[1]/div[1]')
print(review.text)

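Maybe there is a problem with your path? (Sorry, I am not on Windows, so I can't test this.) From memory, Windows paths use the \ character rather than /. You may also need two backslashes after the drive letter (C:\\).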

c:\\Users\91940\AppData\Local\...
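
As a quick check on the path question, here is a minimal sketch using only the standard library's pathlib. It suggests that, in Python, forward slashes and backslashes describe the same Windows path, so the separator style alone is unlikely to be the cause. The shortened path below is a hypothetical stand-in, not the asker's full path.

```python
from pathlib import PureWindowsPath

# Hypothetical shortened path, used only for illustration.
forward = PureWindowsPath(r"C:/Users/91940/chromedriver.exe")
escaped = PureWindowsPath("C:\\Users\\91940\\chromedriver.exe")
raw = PureWindowsPath(r"C:\Users\91940\chromedriver.exe")

# PureWindowsPath normalizes separators, so all three compare equal.
same_path = forward == escaped == raw
print(same_path)
print(forward.name)
```

PureWindowsPath works on any operating system because it only manipulates the path string and never touches the file system.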

I believe print(product_name.text) is executing correctly, right?

The problem is with driver.find_element_by_xpath("//span[@class='score-average']"): I can't find score-average anywhere in the HTML source.

So try this:

rate = driver.find_element_by_css_selector("div.pdp-review-summary")
print(rate.text)
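
To confirm whether a class such as score-average appears in the source at all, one option is to scan the markup for class names. The sketch below uses only the standard library; ClassCollector and has_class are hypothetical helper names, and sample_html is made-up markup standing in for driver.page_source.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every CSS class name that appears in an HTML document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a class attribute
        # may hold several space-separated class names.
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def has_class(html, class_name):
    """Return True if any element in `html` carries `class_name`."""
    collector = ClassCollector()
    collector.feed(html)
    return class_name in collector.classes


# Hypothetical markup; with Selenium you would pass driver.page_source instead.
sample_html = (
    '<div class="pdp-review-summary">'
    '<a class="pdp-link pdp-review-summary__link">Ratings</a>'
    '</div>'
)

print(has_class(sample_html, "score-average"))
print(has_class(sample_html, "pdp-review-summary"))
```

A class that is missing from the source at one moment may still be rendered later by JavaScript, so a negative result is a hint rather than proof.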

You can try the following code to get the reviews:

wait = WebDriverWait(driver, 10)
driver.get("https://www.lazada.sg/products/samsung-galaxy-watch3-bt-45mm-titanium-i1156462257-s4537770883.html?search=1&freeshipping=1")
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[class$='pdp-review-summary__link']"))).click()
ActionChains(driver).move_to_element(wait.until(EC.visibility_of_element_located((By.XPATH, "//h2[contains(text(), 'Ratings & Reviews')]")))).perform()
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.item-content")))
for review in driver.find_elements(By.CSS_SELECTOR, "div.item-content"):
    print(review.get_attribute('innerHTML'))

Imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
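
Since the question imports csv, the collected reviews are presumably meant to be saved. The following is a minimal sketch of that step, assuming each review has already been reduced to a plain string (for example via review.text). write_reviews and the sample strings are hypothetical and not part of the original answer.

```python
import csv
import io


def write_reviews(review_texts, out_file):
    """Write one review per row, skipping empty strings; return the row count."""
    writer = csv.writer(out_file)
    writer.writerow(["index", "review"])
    count = 0
    for text in review_texts:
        cleaned = " ".join(text.split())  # collapse newlines and repeated spaces
        if not cleaned:
            continue
        count += 1
        writer.writerow([count, cleaned])
    return count


# Hypothetical review strings standing in for scraped text.
sample_reviews = ["Great watch,\nfast delivery", "   ", "Battery could be better"]

buffer = io.StringIO()
rows_written = write_reviews(sample_reviews, buffer)
print(rows_written)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with open("reviews.csv", "w", newline="", encoding="utf-8"); the file name here is only an example.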

