Python Beautiful Soup find_all()

Question

I'm trying to use find_all() on the HTML at the address below:

http://www.simon.com/mall

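Following advice from other threads, I ran the link through the site below and it reported errors, but I'm not sure how the errors shown affect what I'm trying to do in Beautiful Soup.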

https://validator.w3.org/

Here is my code:

from requests import get

url = 'http://www.simon.com/mall'
response = get(url)

from bs4 import BeautifulSoup

html = BeautifulSoup(response.text, 'html5lib')
mall_list = html.find_all('div', class_ = 'col-xl-4 col-md-6 ')

print(type(mall_list))
print(len(mall_list))

The result is:

"C:\Program Files\Anaconda3\python.exe" C:/Users/Chris/PycharmProjects/IT485/src/GetMalls.py
<class 'bs4.element.ResultSet'>
0

Process finished with exit code 0

I know there are hundreds of these divs in the HTML. Why am I not getting any matches?

Answer 1

I sometimes use BeautifulSoup too. The problem is in how you're fetching the attribute. The complete working code looks like this:

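import requests
from bs4 import BeautifulSoup

url = 'http://www.simon.com/mall'
response = requests.get(url)
# Name the parser explicitly so the result does not depend on which parsers are installed.
html = BeautifulSoup(response.text, 'html.parser')
# Take the second matching div and collect the <option> tags inside it.
mall_list = html.find_all('div', attrs={'class': 'col-lg-4 col-md-6'})[1].find_all('option')
malls = []

for mall in mall_list:
    # Skip options with an empty value (presumably a placeholder entry).
    if mall.get('value') == '':
        continue
    malls.append(mall.text)

print(malls)
print(type(malls))
print(len(malls))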
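
The indexing and filtering steps in that code can be tried offline on a small inline document. This is only a minimal sketch: the sample HTML, the mall names in it, and the `value` strings are invented for illustration and are not taken from simon.com.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example (not the real page).
SAMPLE_HTML = """
<div class="col-lg-4 col-md-6"><p>first block</p></div>
<div class="col-lg-4 col-md-6">
  <select>
    <option value="">Select a mall</option>
    <option value="1">Example Mall A</option>
    <option value="2">Example Mall B</option>
  </select>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')

# find_all() returns a list-like ResultSet, so [1] picks the second matching div.
blocks = soup.find_all('div', attrs={'class': 'col-lg-4 col-md-6'})
options = blocks[1].find_all('option')

# Keep only the options whose value attribute is not empty.
malls = [opt.text for opt in options if opt.get('value') != '']

print(malls)
```

If fewer than two divs match, `blocks[1]` raises an IndexError, so checking `len(blocks)` first is a reasonable guard when the page layout is not known in advance.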

Answer 2

Your code looks fine. However, when I visit the simon.com/mall link and inspect it in Chrome Dev Tools, there don't appear to be any instances of the class "col-xl-4 col-md-6".

Try testing your code with "col-xl-2"; you should see some results.
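
It may also help to know how Beautiful Soup compares a `class_` string that contains spaces. As its documentation describes, such a string is matched against the exact value of the class attribute, so a different class order or a stray trailing space (as in the question's 'col-xl-4 col-md-6 ') finds nothing, while a CSS selector passed to select() matches each class independently. The markup below is invented for this sketch and is not the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example.
sample = (
    '<div class="col-xl-4 col-md-6">A</div>'
    '<div class="col-md-6 col-xl-4">B</div>'
    '<div class="col-xl-2">C</div>'
)
soup = BeautifulSoup(sample, 'html.parser')

# Exact string match: only the div whose classes appear in this order.
exact = soup.find_all('div', class_='col-xl-4 col-md-6')

# Same string with a trailing space, as in the question's code.
trailing = soup.find_all('div', class_='col-xl-4 col-md-6 ')

# CSS selector: any div carrying both classes, in any order.
both = soup.select('div.col-xl-4.col-md-6')

print(len(exact), len(trailing), len(both))
```

On this sample the three counts differ, which suggests using select() or a single class name when the order and spacing of classes in the page are not guaranteed.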

Answer 3

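I'm assuming you are trying to parse the titles and locations of the different products from that page (the one mentioned in your script). The problem is that the page's content is generated dynamically, so you can't capture it with requests. Instead, you need a browser simulator such as Selenium, which is what I used in the code below. Try this: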

from selenium import webdriver
from bs4 import BeautifulSoup
import time

driver = webdriver.Chrome()
driver.get('http://www.simon.com/mall')
time.sleep(3)  # crude fixed wait so the JavaScript-rendered content can load

soup = BeautifulSoup(driver.page_source, 'lxml')
driver.quit()

for item in soup.find_all(class_="mall-list-item-text"):
    name = item.find_all(class_='mall-list-item-name')[0].text
    location = item.find_all(class_='mall-list-item-location')[0].text
    print(name, location)

Result:

ABQ Uptown Albuquerque, NM
Albertville Premium Outlets® Albertville, MN
Allen Premium Outlets® Allen, TX
Anchorage 5th Avenue Mall Anchorage, AK
Apple Blossom Mall Winchester, VA
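
One inexpensive way to tell whether a page is rendered by JavaScript is to search the raw response text for the class name before parsing anything. If the marker is missing from what requests downloaded, no choice of parser will recover it. The helper name `marker_in_raw_html` and both sample strings below are hypothetical, made up for this sketch; in practice the first argument would be `response.text`.

```python
def marker_in_raw_html(html_text, marker):
    """Return True if the marker string occurs anywhere in the raw HTML text."""
    return marker in html_text


# Invented stand-ins for a server response (not the real simon.com markup).
static_page = '<div class="mall-list-item-text">Example Mall</div>'
dynamic_page = '<div id="app"></div><script src="bundle.js"></script>'

print(marker_in_raw_html(static_page, 'mall-list-item-text'))   # True
print(marker_in_raw_html(dynamic_page, 'mall-list-item-text'))  # False
```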
