Get a specific attribute using BeautifulSoup

I want to extract an attribute from within an HTML tag using BeautifulSoup. How do I do it?

For example:

<div class="search-pagination-top clearfix  mtop ">
                                            <div class="row"><div class="col-l-4 mtop pagination-number" tabindex="0"
aria-label="Page 1 of 15 "><div>Page <b>1</b> of <b>15</b> </div></div>

How do I get the text from the "aria-label" attribute?

I tried using select(), but it didn't help.

You can read the attribute value from the tag just as you would from a dictionary, using the key aria-label.

Example:

from bs4 import BeautifulSoup

html = """<div class="search-pagination-top clearfix  mtop ">
                                            <div class="row"><div class="col-l-4 mtop pagination-number" tabindex="0"
aria-label="Page 1 of 15 "><div>Page <b>1</b> of <b>15</b> </div></div>
"""

soup = BeautifulSoup(html, "html.parser")
print(soup.find("div", class_="col-l-4 mtop pagination-number")["aria-label"])

Output:

Page 1 of 15

from bs4 import BeautifulSoup

html_doc = """
<div class="search-pagination-top clearfix  mtop ">
                                            <div class="row"><div class="col-l-4 mtop pagination-number" tabindex="0"
aria-label="Page 1 of 15 "><div>Page <b>1</b> of <b>15</b> </div></div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

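# Note: this reads the visible text of the nested <div>, not the aria-label
# attribute; the two happen to match for this markup.
print(soup.div.div.text.strip())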

Page 1 of 15
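
Since the question mentions select(), here is a minimal sketch of the same lookup with CSS selectors. One likely reason select() "didn't help" is that it returns a list of tags, so the attribute has to be read from each element rather than from the result itself; that is a guess, since the original attempt is not shown. The variable names (tag, label, labels) are only illustrative, and the snippet assumes a Beautiful Soup version with CSS selector support.

```python
from bs4 import BeautifulSoup

html = """<div class="search-pagination-top clearfix  mtop ">
<div class="row"><div class="col-l-4 mtop pagination-number" tabindex="0"
aria-label="Page 1 of 15 "><div>Page <b>1</b> of <b>15</b> </div></div>
"""

soup = BeautifulSoup(html, "html.parser")

# CSS attribute selector: the first element that carries an aria-label attribute.
tag = soup.select_one("[aria-label]")

# .get() returns None instead of raising KeyError when the attribute is missing.
label = tag.get("aria-label") if tag is not None else None
print(label.strip() if label else "no aria-label found")

# select() returns a list of tags, so the attribute is read per element.
labels = [
    t["aria-label"].strip()
    for t in soup.select("div.pagination-number[aria-label]")
]
print(labels)
```

Using .get() instead of square brackets is handy when some of the matched tags may not have the attribute at all.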
