Scraping data using bs4

Question

I have this line of HTML:

<td title="Druid"><a href="/pvp/druid"><img src="/images/classes/druid.png" class="img-responsive center" alt="Druid"></a></td>

Using Python and Beautiful Soup, I'd like to access the "Druid" value from <td title="Druid"> and store it in a variable.

Answer 1

If you want to access the title of the td:

bs4.BeautifulSoup(data, "html.parser").find("td")["title"]

Answer 2

To get the attributes of a tag, treat it just like a dictionary:

soup.find('td').get('title')

or

soup.find('td')['title']

Note: .get('title', 'some default value') lets you supply a default value that is returned if the attribute is missing (if you omit the default, None is returned), while ['title'] raises a KeyError for a missing attribute.
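The difference described in the note can be seen in a minimal sketch like the one below. It assumes the built-in "html.parser" backend, and it uses colspan purely as an illustrative attribute that this td does not have; the variable names are hypothetical.

```python
from bs4 import BeautifulSoup

html = '''<td title="Druid"><a href="/pvp/druid"><img src="/images/classes/druid.png" class="img-responsive center" alt="Druid"></a></td>'''
soup = BeautifulSoup(html, "html.parser")  # parser choice is an assumption
td = soup.find("td")

# Attribute present: both access styles return the same string.
present_get = td.get("title")
present_idx = td["title"]

# Attribute missing ("colspan" is only an illustrative name here):
missing_default = td.get("colspan", "n/a")  # falls back to the supplied default
missing_none = td.get("colspan")            # no default given, so None

# Square-bracket access on a missing attribute raises KeyError.
try:
    td["colspan"]
    raised = False
except KeyError:
    raised = True

print(present_get, present_idx, missing_default, missing_none, raised)
```

Using .get() tends to be the safer choice when an attribute may be absent on some rows, while square brackets make a missing attribute fail loudly.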

Example

from bs4 import BeautifulSoup
html='''<td title="Druid"><a href="/pvp/druid"><img src="/images/classes/druid.png" class="img-responsive center" alt="Druid"></a></td>'''
# Name the parser explicitly so bs4 does not warn about guessing one
soup = BeautifulSoup(html, 'html.parser')

character = soup.find('td').get('title')

print(character)

Output

Druid
