BeautifulSoup - how to find tags starting with a certain attribute?

For example, I have:

<a class="banana" href="http://example.com">link1</a>
<a href="http://example2.com" class="banana"><img ... /></a>
<a class="banana">link2</a>
<a href="http://google.com">link3</a>

How can I get:

['<a href="http://example2.com" class="banana"><img ... /></a>','<a href="http://google.com">link3</a>']

You can use the CSS selector a[href] to get the a tags that have an href attribute:

h = '''
<a class="banana" href="http://example.com">link1</a>
<a href="http://example2.com" class="banana"><img ... /></a>
<a class="banana">link2</a>
<a href="http://google.com">link3</a>
'''

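from bs4 import BeautifulSoup

# Name the parser explicitly; without it bs4 picks one itself and emits a warning.
soup = BeautifulSoup(h, 'html.parser')
# 'a[href]' matches every <a> tag that has an href attribute, whatever its value.
print(soup.select('a[href]'))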

Output (attribute order inside a tag can vary with the parser and Python version):

[<a class="banana" href="http://example.com">link1</a>,
 <a class="banana" href="http://example2.com"><img ...=""/></a>,
 <a href="http://google.com">link3</a>]
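Note that a[href] matches every a tag that has an href, so link1 is included, while the list asked for in the question contains only the tags whose first attribute is href. If that stricter reading is what is needed, one possible approach is to check the first key of each tag's attrs dict. This is only a sketch: it assumes Python 3.7+ and the html.parser backend, where attrs keeps the attribute order from the markup, and first_attr_is is a hypothetical helper name, not part of bs4.

```python
from bs4 import BeautifulSoup

h = '''
<a class="banana" href="http://example.com">link1</a>
<a href="http://example2.com" class="banana"><img ... /></a>
<a class="banana">link2</a>
<a href="http://google.com">link3</a>
'''


def first_attr_is(tag, name):
    """Return True if the tag's first attribute (in source order) is `name`.

    Hypothetical helper: relies on tag.attrs being a dict that preserves
    insertion order, which is an assumption about the parser and Python version.
    """
    return next(iter(tag.attrs), None) == name


soup = BeautifulSoup(h, 'html.parser')

# Keep only the <a> tags whose first attribute is href.
href_first = [a for a in soup.find_all('a') if first_attr_is(a, 'href')]
print(href_first)
```

Under those assumptions this should keep only the example2.com and google.com links. Other parsers or older Python versions may not preserve attribute order, so the result is worth checking in your own setup.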
