How to get "href" from an HTML tag using BeautifulSoup

Question

I am trying to extract an image link from a table. I have gotten as far as the "td" tag, but I can't get the link inside it. Here is my code:

from bs4 import BeautifulSoup
import requests


def get_html(url):
    r = requests.get(url)
    r.encoding = 'utf8'
    return r.text


data = '''
<td class="cover" valign="top">
<a href="/upload/iblock/ea7/ea72966465cde6ae6674321dcd95d1af.jpg" rel="lightbox"><img alt="Пьесы" src="/upload/iblock/ea7/ea72966465cde6ae6674321dcd95d1af.jpg" title="Пьесы"/></a>
</td>
'''


def get_dt(html):
    soup = BeautifulSoup(html, 'lxml')
    a = soup.findAll('table')[1].findAll('tr')
    for tr in range(len(a)):
        b = a[tr].findAll('td')
        for td in range(len(b)):
            if tr == 0 and td == 0:
                c = b[td]
                print(c.get('href'))


def get_dt2(html):
    soup = BeautifulSoup(html, 'lxml')
    print(soup.get('href'))


# link = 'http://www.rech-deti.ru/catalog/7/61021/'
get_dt2(data)

I keep getting the output:

None

Or, if I use

soup['href']

I get:

Traceback (most recent call last):
  File "C:/Users/Vlad/PycharmProjects/Ultimate_Parser/Rech/rech table test.py", line 42, in <module>
    get_dt2(data)
  File "C:/Users/Vlad/PycharmProjects/Ultimate_Parser/Rech/rech table test.py", line 38, in get_dt2
    print(soup['href'])
  File "C:\Users\Vlad\PycharmProjects\Ultimate_Parser\venv\lib\site-packages\bs4\element.py", line 1401, in __getitem__
    return self.attrs[key]
KeyError: 'href'

I have tried the answers to this question: "Get item from bs4.element.Tag", but neither of them worked.

Answer 1

Try this to get all the a elements that have an href attribute:

def get_dt2(html):
    soup = BeautifulSoup(html, 'lxml')
    # href=True matches only <a> tags that actually carry an href attribute
    for a in soup.find_all('a', href=True):
        print(a['href'])
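
The None and the KeyError in the question come from asking the wrong object for href: soup is the whole parsed document and td is the table cell, and neither of them has an href attribute. Only the a tag nested inside the cell does. Below is a minimal sketch of that lookup, assuming the cell of interest is the one with class "cover" as in the sample data. The helper name first_cover_href is hypothetical, html.parser is used here only to avoid depending on lxml, and using the commented-out page URL from the question as the base for urljoin is also an assumption.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Sample markup copied from the question.
DATA = '''
<td class="cover" valign="top">
<a href="/upload/iblock/ea7/ea72966465cde6ae6674321dcd95d1af.jpg" rel="lightbox"><img alt="Пьесы" src="/upload/iblock/ea7/ea72966465cde6ae6674321dcd95d1af.jpg" title="Пьесы"/></a>
</td>
'''

# Assumption: the page URL commented out in the question is the base URL.
PAGE_URL = 'http://www.rech-deti.ru/catalog/7/61021/'


def first_cover_href(html):
    """Return the href of the first <a> inside <td class="cover">, or None."""
    soup = BeautifulSoup(html, 'html.parser')
    td = soup.find('td', class_='cover')
    if td is None:
        return None
    # The <td> itself has no href; the attribute lives on the nested <a>.
    a = td.find('a', href=True)
    return a['href'] if a is not None else None


href = first_cover_href(DATA)
print(href)

# The href is site-relative, so join it with the page URL to get a full link.
full_url = urljoin(PAGE_URL, href)
print(full_url)
```

Returning None when the cell or the link is missing keeps the caller from hitting the same KeyError on pages where the cover image is absent.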

Answer 1 (accepted), 2020-09-06 01:10:57