Getting href values with BeautifulSoup

Question

If a linked page contains sub-links, how do I get their href values?

Code:

a_links = []

for link in links:
    response = requests.get(link)
    soup_link = BeautifulSoup(response.text, 'lxml')
    a_cont = soup_link.find_all('div', class_= 'detail__anchor-numb')
    for a in a_cont.find_all('a'):
        a_link = a['href']
        a_links.append(a_link)

Output:

AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?

Answer 1

I believe your error is here:

 a_cont = soup_link.find_all('div', class_= 'detail__anchor-numb')
 for a in a_cont.find_all('a'):
    ...

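a_cont is an iterable (a ResultSet) because you called find_all, so it has no find_all method of its own. If you expected a single element, call find instead.

Otherwise, the simplest fix is to iterate over a_cont, although the code starts to get quite nested. You may want to refactor after that.

Example code: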

a_cont = soup_link.find_all('div', class_='detail__anchor-numb')
for div in a_cont:
    # each div is a single Tag, so calling find_all() on it works
    for a in div.find_all('a'):
        a_links.append(a['href'])

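Note that the error message itself points this out. Python and its widely used packages are good at hinting at what may have gone wrong, and reading those messages closely goes a long way toward fixing errors like this one.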
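
To make the find / find_all difference concrete, here is a minimal sketch that runs without any network access. The HTML string is made up for illustration (only the class name comes from the question), and it uses the built-in 'html.parser' instead of 'lxml' so that no extra parser is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for one fetched page.
html = """
<div class="detail__anchor-numb"><a href="/a">A</a><a href="/b">B</a></div>
<div class="detail__anchor-numb"><a href="/c">C</a></div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching Tag (or None if nothing matches),
# so find_all() can be called on the result directly.
first_div = soup.find("div", class_="detail__anchor-numb")
first_hrefs = [a["href"] for a in first_div.find_all("a")]

# find_all() returns a list-like ResultSet, so loop over it first and
# call find_all() on each individual Tag.
all_hrefs = []
for div in soup.find_all("div", class_="detail__anchor-numb"):
    for a in div.find_all("a"):
        all_hrefs.append(a["href"])

print(first_hrefs)  # links from the first div only
print(all_hrefs)    # links from every matching div
```

With find you only get the links inside the first matching div; the nested loop collects them from all of them.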

Answer 2

You can use CSS selectors to get the href values, like this:

a_links = []

for link in links:
    response = requests.get(link)
    soup_link = BeautifulSoup(response.text, 'lxml')
    a_cont = soup_link.select('div.detail__anchor-numb a')
    for a in a_cont:
        a_link = a['href']
        a_links.append(a_link)
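
One thing to watch for, as a hedged aside: a['href'] raises KeyError if an a tag has no href attribute. The sketch below assumes that could happen on your pages and narrows the selector to a[href] so such tags are skipped. The markup and the extract_hrefs helper are hypothetical, and 'html.parser' is used here only to avoid the lxml dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one real link, one anchor without href,
# and one link outside the target div.
html = """
<div class="detail__anchor-numb">
  <a href="/a">A</a>
  <a name="no-href">no link</a>
</div>
<div class="other"><a href="/ignored">X</a></div>
"""


def extract_hrefs(markup):
    """Return the href of every <a href=...> inside div.detail__anchor-numb."""
    soup = BeautifulSoup(markup, "html.parser")
    # "a[href]" matches only anchors that actually carry an href attribute.
    return [a["href"] for a in soup.select("div.detail__anchor-numb a[href]")]


hrefs = extract_hrefs(html)
print(hrefs)
```

Inside the loop from the answer, this would amount to changing the selector string and leaving the rest as it is.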
