Scraping a web page after accepting cookies in Python

Question

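I am trying to scrape a web page, but a cookie-consent banner appears before the page can be accessed. I am using Selenium to click the "Accept all cookies" button, but even after clicking it I do not get the correct HTML page.

Here is my code: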

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup

url = 'https://www.wikiparfum.fr/explore/by-name?query=dior'

driver = webdriver.Chrome(executable_path=DRIVER_PATH)

driver.get(url)
driver.find_element_by_id('onetrust-accept-btn-handler').click()

html = driver.page_source
soup = BeautifulSoup(html, 'lxml')

print(soup)

Here is the beginning of the printed HTML page:

If anyone can help me solve this, thank you!

Answer 1

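You should wait for the accept-cookies button to appear before clicking it. Consent banners like this one are typically injected by JavaScript after the initial page load, so a lookup made immediately after driver.get() can run before the button exists: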

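from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup

url = 'https://www.wikiparfum.fr/explore/by-name?query=dior'

# DRIVER_PATH is the path to your chromedriver binary, as in the question.
# Note: Selenium 4.10+ removed executable_path; pass a Service object instead.
driver = webdriver.Chrome(executable_path=DRIVER_PATH)
# Explicit wait: polls for up to 20 seconds, then raises TimeoutException.
wait = WebDriverWait(driver, 20)

driver.get(url)
# Block until the banner's accept button is visible, then click it.
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#onetrust-accept-btn-handler"))).click()

# Grab the page source only after the banner has been dismissed.
html = driver.page_source
soup = BeautifulSoup(html, 'lxml')

print(soup)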
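
For intuition about what wait.until(...) does in the code above, here is a minimal sketch of the polling idea behind an explicit wait. It is not Selenium's implementation (WebDriverWait also ignores certain exceptions while polling, for example), and the names wait_until, WaitTimeout and button_if_present are hypothetical, chosen only for illustration. The "button" is a plain stand-in that becomes available on the third check, so the sketch runs without a browser.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for a button that only "appears" on the third check.
checks = {"count": 0}


def button_if_present():
    checks["count"] += 1
    return "accept-button" if checks["count"] >= 3 else None


found = wait_until(button_if_present, timeout=2.0, poll_interval=0.01)
print(found, "after", checks["count"], "checks")
```

A single immediate lookup corresponds to calling button_if_present() once and giving up, which is why the original find_element_by_id call can miss a banner that has not been rendered yet.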

Answer 1: score 2, accepted, 2021-08-25 13:44:06
