
Website scraping not showing full HTML elements

I'm new to programming and I've been trying to scrape the website https://www.boove.se/butiker/mc, but my code is not returning the full HTML elements of the page.

My code:

   from selenium import webdriver
   from selenium.webdriver.chrome.service import Service
   from selenium.webdriver.common.by import By

   chromedriver_path = r"C:\chromedriver_win32\chromedriver.exe"
   driver = webdriver.Chrome(service=Service(chromedriver_path))
   url = "https://www.boove.se/butiker/mc"
   driver.get(url)
   content = driver.find_elements(by=By.CSS_SELECTOR, value= "h3 .address desktop")
   print(content)
   driver.close()

There are two issues with your code:

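  • It needs the correct CSS selector to find the elements. "h3 .address desktop" looks for a <desktop> tag nested inside an element with class "address" inside an <h3>; to match an <h3> that carries both classes, write 'h3.address.desktop'.

  • You have to wait until the elements are present in the page before you look them up: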

    wait = WebDriverWait(driver, 10)
    wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'h3.address.desktop')))

Example
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = 'https://www.boove.se/butiker/mc'

# Selenium 4 expects the driver path wrapped in a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.maximize_window()
driver.get(url)

wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'h3.address.desktop')))

content = driver.find_elements(By.CSS_SELECTOR, 'h3.address.desktop')

for i in content:
    print(i.text)
driver.close()
Output
Odenskogsvägen 35, 83148 Östersund
STORGATAN 43, 77100 Ludvika
INÄLVSVÄGEN 1, 69152 Karlskoga
VÄNE-RYR, 46293 Vänersborg
Västbergavägen 24, 12630 Hägersten
TENNGATAN 15, 23435 Lomma
HÄGERSTENS ALLE 12, 12937 Hägersten
SÅGVÄGEN 11A, 18440 Åkersberga
Sollentuna, 19278 Rotebro
Kuskvägen 2, 19162 Sollentuna
Mölndalsvägen 25, 41263 GÖTEBORG
SALAVÄGEN 5, 74537 Enköping
Lundbygårdsgata 3, 72134 Västerås
...
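
To see why the original selector matched nothing, the two selectors can be compared on a small piece of static HTML. This is only a sketch: the markup and the address in it are made up for illustration (only the class names come from the answer above), and it uses BeautifulSoup's `select` rather than Selenium so that it runs without a browser.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the class names "address" and "desktop"
# are taken from the answer above, the rest is invented.
html = """
<div class="store">
  <h3 class="address desktop">Example Street 1, 12345 Example Town</h3>
  <h3 class="address mobile">Example Street 1</h3>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Spaces mean "descendant": a <desktop> tag inside .address inside <h3>.
wrong = soup.select("h3 .address desktop")

# No spaces: one <h3> element that has both classes.
right = soup.select("h3.address.desktop")

print(len(wrong), len(right))
print([h3.get_text(strip=True) for h3 in right])
```

The same selector rules apply to `By.CSS_SELECTOR` in Selenium, so the compound form is the one to pass to `find_elements`.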

You can also scrape data with the Python library BeautifulSoup, which is easier to use.

Here is the boilerplate code:

from bs4 import BeautifulSoup
import requests
url = ""
req = requests.get(url)
soup = BeautifulSoup(req.text, "html.parser")
print(soup)
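
One caveat: `requests.get` only downloads the initial HTML and does not run JavaScript. If the addresses on this page are inserted by a script (the need for an explicit wait in the other answer suggests this, but it is not confirmed here), the soup will not contain them. A possible workaround is to let Selenium render the page and pass `driver.page_source` to BeautifulSoup. The sketch below shows only the parsing step; `extract_addresses` and the sample HTML are hypothetical names and data used for illustration.

```python
from bs4 import BeautifulSoup


def extract_addresses(html: str) -> list[str]:
    """Return the text of every <h3 class="address desktop"> element in html."""
    soup = BeautifulSoup(html, "html.parser")
    return [h3.get_text(strip=True) for h3 in soup.select("h3.address.desktop")]


# Hypothetical stand-in for rendered HTML; with Selenium this string
# would be driver.page_source, read after the wait has succeeded.
sample_html = (
    '<h3 class="address desktop"> Example Road 2, 54321 Sampleville </h3>'
    '<h3 class="address desktop">Example Lane 3, 11111 Testby</h3>'
    '<h3 class="address mobile">Example Lane 3</h3>'
)

addresses = extract_addresses(sample_html)
print(addresses)
```

Keeping the parsing in its own function means the same code works whether the HTML string comes from `requests` or from a browser session.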
