
How to select tags by attribute value with Beautiful Soup

I have the following HTML fragment:

>>> a
<div class="headercolumn">
<h2>
<a class="results" data-name="result-name" href="/xxy"> my text</a>
</h2>

I am trying to select the header column only when the link inside it has the attribute data-name="result-name".

I've tried:

>>> a.select('a["data-name="result-name""]')

This gives:

ValueError: Unsupported or invalid CSS selector: 

How can I get this working?

You can simply do this:

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")
results = soup.find_all("a", {"data-name": "result-name"})

Source: How to find tags with only certain attributes - BeautifulSoup
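
Since the question asks for the header column rather than the link itself, one possible follow-up is to walk up from each matching link to its enclosing div. This is only a minimal sketch: it assumes the href in the fragment is meant to be properly quoted, that the built-in html.parser is acceptable, and the variable names links and columns are illustrative.

```python
from bs4 import BeautifulSoup

# Assumed, well-formed version of the fragment from the question.
html = """
<div class="headercolumn">
<h2>
<a class="results" data-name="result-name" href="/xxy"> my text</a>
</h2>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Find the links carrying the wanted attribute value, then climb to the
# enclosing header column div for each one.
links = soup.find_all("a", attrs={"data-name": "result-name"})
columns = [link.find_parent("div", class_="headercolumn") for link in links]

for column in columns:
    print(column["class"], column.a.get_text(strip=True))
```

If no link has the attribute, links is empty and so is columns, so nothing is selected.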

from bs4 import BeautifulSoup

html = """
<div class="headercolumn">
<h2>
<a class="results" data-name="result-name" href="/xxy"> my text</a>
</h2>
"""

# Name the parser explicitly so results do not depend on what is installed.
soup = BeautifulSoup(html, "html.parser")
for d in soup.find_all("div", {"class": "headercolumn"}):
    print(d.a.get("data-name"))   # value of the data-name attribute
    print(d.select("a.results"))  # list of <a> tags with class "results"

result-name
[<a class="results" data-name="result-name" href="/xxy"> my text</a>]

Select by class or id:

soup.select('a.gamers')  # select all `a` tags with the class gamers
soup.select('a#gamer')   # select the `a` tag with the id gamer

Select by a single attribute:

soup.select('a[attr="value"]')
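
Applied to the fragment from the question, the failing selector most likely needs only its quoting fixed: the attribute name goes outside the quotes and the value inside. A sketch, assuming a well-formed href and the built-in html.parser:

```python
from bs4 import BeautifulSoup

# Assumed, well-formed version of the fragment from the question.
html = (
    '<div class="headercolumn"><h2>'
    '<a class="results" data-name="result-name" href="/xxy"> my text</a>'
    '</h2></div>'
)
soup = BeautifulSoup(html, "html.parser")

# [name="value"] matches tags whose attribute equals the value exactly.
matches = soup.select('a[data-name="result-name"]')

# A descendant combinator restricts the match to links inside the header column.
scoped = soup.select('div.headercolumn a[data-name="result-name"]')

print(matches[0]["href"])
```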

Select by multiple attributes:

attr_dict = {
    'attr1': 'val1',
    'attr2': 'val2',
    'attr3': 'val3',
}

soup.find_all('a', attr_dict)
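
The same kind of multi-attribute filter can also be written as a CSS selector by chaining attribute selectors. The sketch below reuses the placeholder names attr1 and attr2 on a made-up two-link document, so the markup is purely illustrative:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the first link carries both attributes.
html = """
<a attr1="val1" attr2="val2" href="/both">both</a>
<a attr1="val1" href="/one">only one</a>
"""
soup = BeautifulSoup(html, "html.parser")

# Dictionary form: every key/value pair must match.
by_dict = soup.find_all("a", attrs={"attr1": "val1", "attr2": "val2"})

# CSS form: chained [attr="value"] selectors must all match as well.
by_css = soup.select('a[attr1="val1"][attr2="val2"]')

print([a["href"] for a in by_css])
```

Both forms should return the same tags here; which one to use is mostly a matter of taste.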

soup.select accepts standard CSS selectors, so class, id and attribute selectors can be combined in a single expression.
