How to scrape card details using Beautiful Soup and Python

Question

I am trying to scrape this link: https://www.axisbank.com/retail/cards/credit-card

using the following code:

from urllib.request import urlopen
from bs4 import BeautifulSoup
import json, requests, re

axis_url = ["https://www.axisbank.com/retail/cards/credit-card"]

html = requests.get(axis_url[0])
soup = BeautifulSoup(html.content, 'lxml')

for d in soup.find_all('span'):
    print(d.get_text())

Output:

close
5.15%
%
4.00%
%
5.40%

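Basically, I want to get the details of every card listed on that page.

I tried different tags, but none of them worked.

I would be glad to see code that meets my requirement.

Any help is much appreciated.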

Answer 1

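What happens?

Your main problem is that the website serves its content dynamically, so the response you get from requests does not contain what you are after. Print your soup and take a look: it will not contain the elements you inspected in the browser.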
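One way to check this is to count how many elements a CSS selector matches in a given HTML string. The sketch below does that on two made-up snippets, which are stand-ins (not the real Axis Bank markup) for what requests might return and for what the browser shows once JavaScript has run. `count_matches` is a hypothetical helper name, and the selector is the one used in the example further down.

```python
from bs4 import BeautifulSoup

SELECTOR = "#ulCreditCard li li > span"


def count_matches(html: str, selector: str = SELECTOR) -> int:
    """Return how many elements in `html` match the CSS selector."""
    soup = BeautifulSoup(html, "html.parser")
    return len(soup.select(selector))


# Assumption: the raw response only carries an empty container.
static_html = '<div id="cards"><ul id="ulCreditCard"></ul></div>'

# Assumption: after rendering, the container is filled with card details.
rendered_html = (
    '<div id="cards"><ul id="ulCreditCard">'
    "<li><ul>"
    "<li><span>Joining fee</span></li>"
    "<li><span>Annual fee</span></li>"
    "</ul></li>"
    "</ul></div>"
)

print(count_matches(static_html))
print(count_matches(rendered_html))
```

If the same check on `requests.get(url).text` returns 0 for a selector that clearly matches in the browser's inspector, the elements are most likely being added by JavaScript after the page loads.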

How to fix?

Selenium can handle dynamically generated content and gives you the information you inspected:

Example

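from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Selenium 4 takes the driver path through a Service object;
# the older executable_path argument is deprecated and removed in later 4.x releases.
service = Service(r'C:\Program Files\ChromeDriver\chromedriver.exe')
driver = webdriver.Chrome(service=service)
url = 'https://www.axisbank.com/retail/cards/credit-card'
driver.get(url)

# Parse the HTML as rendered by the browser, not the raw response
soup = BeautifulSoup(driver.page_source, 'lxml')

# End the browser session and the driver process
driver.quit()

# Collect the text of each card detail; '^^' separates text from nested tags
textList = []
for d in soup.select('#ulCreditCard li li > span'):
    textList.append(d.get_text('^^', strip=True))

print(textList)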
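To illustrate what `get_text('^^', strip=True)` produces, here is a small sketch on made-up markup (the real page's structure may differ). Each text fragment inside a matched span is stripped and the fragments are joined with `^^`, so splitting on that separator afterwards gives the individual fields.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only; not copied from the real page.
sample_html = """
<ul id="ulCreditCard">
  <li>
    <ul>
      <li><span>Joining fee <b>Rs. 500</b></span></li>
      <li><span>Annual fee <b>Rs. 250</b></span></li>
    </ul>
  </li>
</ul>
"""

soup = BeautifulSoup(sample_html, "html.parser")

rows = []
for span in soup.select("#ulCreditCard li li > span"):
    # Strip each text fragment and join the fragments with '^^'
    text = span.get_text("^^", strip=True)
    # Split the joined string back into separate fields
    rows.append(text.split("^^"))

print(rows)
```

A separator that is unlikely to occur in the page text, such as `^^`, keeps the split unambiguous.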

Answer 1
1 accepted 2021-02-16 18:47:47