Scraping data with bs4

I have this line of HTML:

<td title="Druid"><a href="/pvp/druid"><img src="/images/classes/druid.png" class="img-responsive center" alt="Druid"></a></td>

Using Python and Beautiful Soup, I'd like to read the "Druid" value from <td title="Druid"> and store it in a variable.

If you want to access the title of the td:

bs4.BeautifulSoup(data, "html.parser").find("td")["title"]

To get the attributes of a tag, treat it just like a dictionary:

soup.find('td').get('title')

or

soup.find('td')['title']

Note: .get('title', 'some default value') lets you supply a default that is returned when the attribute is missing; if you omit the default, .get('title') returns None. ['title'], by contrast, raises a KeyError when the attribute is missing.
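To make the difference concrete, here is a minimal sketch that compares the three lookups on one tag. The colspan attribute and the "1" default are assumptions chosen only because the sample td does not carry that attribute, and the markup is shortened from the question.

```python
from bs4 import BeautifulSoup

# Shortened version of the question's markup (the img tag is left out).
html = '<td title="Druid"><a href="/pvp/druid">Druid</a></td>'
soup = BeautifulSoup(html, "html.parser")
td = soup.find("td")

present = td.get("title")          # attribute exists, so its value is returned
missing = td.get("colspan")        # attribute absent, no default given
fallback = td.get("colspan", "1")  # attribute absent, hypothetical default used

# Square-bracket access on an absent attribute raises KeyError.
try:
    td["colspan"]
    raised = False
except KeyError:
    raised = True

print(present, missing, fallback, raised)  # Druid None 1 True
```

Use .get() when the attribute may legitimately be absent, and ['title'] when a missing attribute should be treated as an error.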

Example

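from bs4 import BeautifulSoup

html = '''<td title="Druid"><a href="/pvp/druid"><img src="/images/classes/druid.png" class="img-responsive center" alt="Druid"></a></td>'''
# Name the parser explicitly so bs4 does not have to guess one.
soup = BeautifulSoup(html, "html.parser")

# Tag attributes are read like dictionary entries.
character = soup.find('td').get('title')

print(character)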

Output

Druid
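On a real page there are usually several cells, and find() returns None when nothing matches, so chaining .get() directly onto it can fail. The sketch below is one way to handle both cases. The table row, the Mage cell, and the first_title helper are hypothetical and were added only for illustration; only the Druid cell comes from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical row: only the Druid cell is taken from the question.
html = '''<table><tr>
<td title="Druid"><a href="/pvp/druid">Druid</a></td>
<td title="Mage"><a href="/pvp/mage">Mage</a></td>
<td>no title here</td>
</tr></table>'''
soup = BeautifulSoup(html, "html.parser")

# title=True keeps only the td tags that actually have a title attribute,
# so the square-bracket lookup cannot raise KeyError here.
titles = [td["title"] for td in soup.find_all("td", title=True)]


def first_title(soup, tag_name):
    """Return the title of the first matching tag, or None if there is none."""
    tag = soup.find(tag_name)
    return tag.get("title") if tag is not None else None


print(titles)                    # ['Druid', 'Mage']
print(first_title(soup, "td"))   # Druid
print(first_title(soup, "th"))   # None
```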
