Website scraping not showing full HTML elements

I am new to programming, and I have been trying to scrape the website https://www.boove.se/butiker/mc, but it does not show the site's full HTML elements.

My code:

   from selenium import webdriver
   from selenium.webdriver.chrome.service import Service
   from selenium.webdriver.common.by import By

   chromedriver_path = r"C:\chromedriver_win32\chromedriver.exe"
   driver = webdriver.Chrome(service=Service(chromedriver_path))
   url = "https://www.boove.se/butiker/mc"
   driver.get(url)
   content = driver.find_elements(by=By.CSS_SELECTOR, value= "h3 .address desktop")
   print(content)
   driver.close()

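Your code has two problems:

  • It needs the correct CSS selector to find the elements: 'h3.address.desktop' rather than "h3 .address desktop"

  • You have to wait until the elements are found

    wait = WebDriverWait(driver, 10)
    wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'h3.address.desktop')))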
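On the first point: in CSS, a space is the descendant combinator, so "h3 .address desktop" asks for a <desktop> element inside an element with class "address" inside an <h3>. Written without spaces, 'h3.address.desktop' means a single <h3> that carries both classes. The sketch below illustrates that compound match with only the standard library. The markup and the address in it are hypothetical, modeled on the selector above rather than copied from the real page, and the helper names are made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical markup, modeled on the selector used in the answer.
SAMPLE_HTML = """
<div class="store">
  <h3 class="address desktop">Storgatan 1, 12345 Exempelstad</h3>
  <h3 class="address mobile">Storgatan 1</h3>
  <h3 class="title">Exempel MC</h3>
</div>
"""


class CompoundSelectorMatcher(HTMLParser):
    """Collect the text of <tag> elements that carry all required classes."""

    def __init__(self, tag, required_classes):
        super().__init__()
        self.tag = tag
        self.required = set(required_classes)
        self.matches = []
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        # class="address desktop" is one attribute holding two class names.
        classes = set((dict(attrs).get("class") or "").split())
        if tag == self.tag and self.required <= classes:
            self._capturing = True
            self.matches.append("")

    def handle_data(self, data):
        if self._capturing:
            self.matches[-1] += data

    def handle_endtag(self, tag):
        if tag == self.tag:
            self._capturing = False


def select_text(html, tag, required_classes):
    """Rough equivalent of the selector tag.class1.class2 (illustration only)."""
    matcher = CompoundSelectorMatcher(tag, required_classes)
    matcher.feed(html)
    return [text.strip() for text in matcher.matches]


# Equivalent in spirit to the selector 'h3.address.desktop'.
print(select_text(SAMPLE_HTML, "h3", ["address", "desktop"]))
```

Only the first <h3> is matched, because it is the only one with both classes. There is no <desktop> element anywhere in the sample, which is why the selector with spaces finds nothing.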
Example
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = 'https://www.boove.se/butiker/mc'

# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.maximize_window()
driver.get(url)

# Wait up to 10 seconds for the address elements to appear in the DOM
wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'h3.address.desktop')))

content = driver.find_elements(By.CSS_SELECTOR, 'h3.address.desktop')

for i in content:
    print(i.text)
driver.quit()
Output
Odenskogsvägen 35, 83148 Östersund
STORGATAN 43, 77100 Ludvika
INÄLVSVÄGEN 1, 69152 Karlskoga
VÄNE-RYR, 46293 Vänersborg
Västbergavägen 24, 12630 Hägersten
TENNGATAN 15, 23435 Lomma
HÄGERSTENS ALLE 12, 12937 Hägersten
SÅGVÄGEN 11A, 18440 Åkersberga
Sollentuna, 19278 Rotebro
Kuskvägen 2, 19162 Sollentuna
Mölndalsvägen 25, 41263 GÖTEBORG
SALAVÄGEN 5, 74537 Enköping
Lundbygårdsgata 3, 72134 Västerås
...
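
For intuition about the second point, an explicit wait is essentially a polling loop: the condition is checked repeatedly until it returns something truthy or the timeout expires. WebDriverWait does this for you (polling every 0.5 seconds by default and raising TimeoutException on failure). The following is a minimal standard-library sketch of that idea, not Selenium's implementation. wait_until and find_addresses are hypothetical names, and the "page" is simulated with a timer.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


# Stand-in for a page that fills in its elements shortly after loading.
ready_at = time.monotonic() + 0.2


def find_addresses():
    # Returns an empty (falsy) list until the simulated content has "rendered".
    if time.monotonic() >= ready_at:
        return ["address 1", "address 2"]
    return []


# Polls every 50 ms and returns as soon as the list is non-empty.
print(wait_until(find_addresses, timeout=2, poll_interval=0.05))
```

Calling find_elements right after driver.get(url) corresponds to calling find_addresses() once, before the content is ready, which is why the original code printed an empty list.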

You can easily scrape the data with BeautifulSoup, a Python library that is easier to use.

Here is the boilerplate code:

from bs4 import BeautifulSoup
import requests
url = ""
req = requests.get(url)
soup = BeautifulSoup(req.text, "html.parser")
print(soup)
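
To go from the whole page to just the addresses, the same CSS selector from the other answer can be passed to soup.select(). Keep in mind that requests does not execute JavaScript, so this only works if the addresses are already present in the HTML the server returns. Whether that holds for this site is not established in this thread. The sketch below avoids the network and parses a hypothetical HTML string instead. In real use, the string would be req.text.

```python
from bs4 import BeautifulSoup

# Hypothetical server response; in practice this would be requests.get(url).text.
html = """
<div class="store">
  <h3 class="address desktop">Storgatan 1, 12345 Exempelstad</h3>
  <h3 class="address mobile">Storgatan 1</h3>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# select() accepts CSS selectors; this one matches <h3> elements with both classes.
addresses = [h3.get_text(strip=True) for h3 in soup.select("h3.address.desktop")]
print(addresses)
```

If the list comes back empty for the real page, the elements are most likely added by JavaScript after loading, and a browser-driven approach such as Selenium is needed.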

