Get value of attribute using CSS Selectors with BeautifulSoup

Question

I am web-scraping with Python, using the BeautifulSoup library.

I have HTML markup like this:

<tr class="deals" data-url="www.example2.com">
<span class="hotel-name">
<a href="www.example2.com"></a>
</span>
</tr>
<tr class="deals" data-url="www.example3.com">
<span class="hotel-name">
<a href="www.example3.com"></a>
</span>
</tr>

I want to get the data-url or the href value from every <tr>, preferably the href value.

Here is a little snippet of my relevant code:

main_url =  "http://localhost/test.htm"
page  = requests.get(main_url).text
soup_expatistan = BeautifulSoup(page)

print (soup_expatistan.select("tr.deals").data-url)
# or  print (soup_expatistan.select("tr.deals").["data-url"])

Answer 1

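You can use the CSS selector tr.deals span.hotel-name a to get to the link. Note that select() returns a list of matching tags, so index into the result before reading an attribute: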

from bs4 import BeautifulSoup

data = """
<tr class="deals" data-url="www.example.com">
<span class="hotel-name">
<a href="wwwexample2.com"></a>
</span>
</tr>
"""

# Name the parser explicitly; html.parser keeps a <tr> that is not inside a <table>
soup = BeautifulSoup(data, "html.parser")
print(soup.select('tr.deals span.hotel-name a')[0]['href'])

Prints:

wwwexample2.com
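
When only the first match matters, select_one() may be a convenient alternative to indexing, since it returns a single tag (or None when nothing matches) rather than a list. The following is a minimal sketch under that assumption; the sample markup, the variable names, and the choice of html.parser are illustrative and not part of the original answer:

```python
from bs4 import BeautifulSoup

html = """
<tr class="deals" data-url="www.example2.com">
<span class="hotel-name">
<a href="www.example2.com"></a>
</span>
</tr>
"""

# html.parser is named explicitly (an assumption for this sketch): it keeps
# a bare <tr> as written, whereas other parsers may restructure the fragment.
soup = BeautifulSoup(html, "html.parser")

# select_one() returns the first matching tag, or None if there is no match.
link = soup.select_one("tr.deals span.hotel-name a")

# .get() returns None instead of raising KeyError when the attribute is absent.
first_href = link.get("href") if link is not None else None
print(first_href)
```

Checking for None before reading the attribute avoids an exception on pages where no row matches the selector.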

If you have multiple links, iterate over them:

for link in soup.select('tr.deals span.hotel-name a'):
    print(link['href'])
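
The question also asks about the data-url attribute on the rows themselves. As a hedged sketch, the same subscript access works on the <tr> tags, and an attribute selector such as [data-url] can restrict the match to tags that actually carry the attribute. The markup below is copied from the question; the variable names and the html.parser choice are assumptions for illustration:

```python
from bs4 import BeautifulSoup

html = """
<tr class="deals" data-url="www.example2.com">
<span class="hotel-name"><a href="www.example2.com"></a></span>
</tr>
<tr class="deals" data-url="www.example3.com">
<span class="hotel-name"><a href="www.example3.com"></a></span>
</tr>
"""

soup = BeautifulSoup(html, "html.parser")

# tr.deals[data-url] matches only rows that have a data-url attribute.
data_urls = [row["data-url"] for row in soup.select("tr.deals[data-url]")]

# a[href] skips anchors without an href, so the subscript cannot raise KeyError.
hrefs = [a["href"] for a in soup.select("tr.deals span.hotel-name a[href]")]

print(data_urls)
print(hrefs)
```

With this sample markup, both lists contain www.example2.com and www.example3.com, in document order.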

solution1
3 ACCEPTED 2014-11-07 14:22:36
