Beautiful Soup and the findAll() process

Question

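I am trying to scrape data from a website with the code below. The site needs a decoding step, for which I followed @royatirek's solution. My problem is that container_a ends up containing nothing. I have used a similar approach on some other sites and it works, but on this site and a few others container_a remains an empty list. Cheers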

from urllib.request import Request, urlopen
from bs4 import BeautifulSoup as soup

# The URL must be a single unbroken string literal
my_url = 'http://www.news.com.au/sport/afl-round-3-teams-full-lineups-and-the-best-supercoach-advice/news-story/dfbe9e0e68d445e07c9522a138a2b824'
req = Request(my_url, headers={'User-Agent': 'Mozilla/5.0'})
web_byte = urlopen(req).read()
webpage = web_byte.decode('utf-8')
# Parse the decoded text (the original passed the raw bytes, leaving webpage unused)
page_soup = soup(webpage, "html.parser")
container_a = page_soup.findAll("div",{"class":"fyre-comment-wrapper"})

Answer 1

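The content you want to parse is loaded dynamically by JavaScript, so a plain HTTP fetch (urllib here, or requests) will not do the job: the comment markup is not in the HTML the server returns. You can use selenium with ChromeDriver or any other driver: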

from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Chrome()
driver.get("http://www.news.com.au/sport/afl-round-3-teams-full-lineups-and-the-best-supercoach-advice/news-story/dfbe9e0e68d445e07c9522a138a2b824")

You can then access the rendered page source through .page_source and carry on with BeautifulSoup as before:

page_soup = BeautifulSoup(driver.page_source, "html.parser")
container_a = page_soup.findAll("div",{"class":"fyre-comment-wrapper"})
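
Before reaching for a browser driver, a quick way to confirm this diagnosis is to check whether the class name occurs anywhere in the HTML the server actually returned. The sketch below is only illustrative: the helper name class_in_static_html and the two sample strings are hypothetical stand-ins, not the real news.com.au markup.

```python
def class_in_static_html(html: str, class_name: str) -> bool:
    """Return True if class_name occurs anywhere in the raw HTML text."""
    return class_name in html


# Hypothetical stand-in for what a plain HTTP fetch returns:
# an empty placeholder that a script is expected to fill in later.
static_page = '<div id="comments"></div><script src="widget.js"></script>'

# Hypothetical stand-in for the DOM after the script has run in a browser.
rendered_page = (
    '<div id="comments">'
    '<div class="fyre-comment-wrapper">Hi</div>'
    '</div>'
)

print(class_in_static_html(static_page, "fyre-comment-wrapper"))    # False
print(class_in_static_html(rendered_page, "fyre-comment-wrapper"))  # True
```

In the question's script the equivalent check would be "fyre-comment-wrapper" in webpage. If that is False, findAll has nothing to match and an empty list is the expected result. A substring test is deliberately crude: it can report True when the name only appears inside a script or stylesheet, so treat a True result as a hint, not proof.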


Solution 1
1, accepted, 2018-05-12 01:56:42
