python beautiful soup extract href

Question

So I'm testing Beautiful Soup with Python (it's great, for those who are wondering).

I have a problem when I want to get the href from a link that I found, and I don't understand why I can't get it.

Here is my code:

for url in soup.find_all('article'):
    if "Gonz Logo" in url.get_text():
        if "Black" in url.get_text():
            print(url)

This works, but it gives me this:

<article><div class="inner-article"><a href="/shop/jackets/gw1diqgyr/n53istanq" style="height:150px;"><img alt="N7qmqyee 3g" height="150" src="//assets.supremenewyork.com/147789/vi/N7qMqyEe_3g.jpg" width="150"/></a><h1><a class="name-link" href="/shop/jackets/gw1diqgyr/n53istanq">Gonz Logo Coaches Jacket </a></h1><p><a class="name-link" href="/shop/jackets/gw1diqgyr/n53istanq">Black</a></p></div></article>

(yeah, a big line...)

The problem is that I only want to get the href. When I try:

    print(url.get('href'))

the output is: None

I have no idea why.

Thank you for your answers!

Answer 1

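I think you get None because soup.find_all('article') returns the <article> tags, and an <article> tag has no href attribute of its own, so url.get('href') returns None. The href lives on the <a> tags nested inside the article.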
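To make the cause concrete, here is a minimal sketch using a trimmed copy of the markup from the question. The variable names (html, article_href, first_link_href) are just for illustration, and I'm assuming the built-in "html.parser" backend; it shows that .get('href') on the article gives None, while the same call on a nested a tag gives the link.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup posted in the question.
html = (
    '<article><div class="inner-article">'
    '<a href="/shop/jackets/gw1diqgyr/n53istanq"><img alt="N7qmqyee 3g"/></a>'
    '<h1><a class="name-link" href="/shop/jackets/gw1diqgyr/n53istanq">'
    'Gonz Logo Coaches Jacket </a></h1>'
    '<p><a class="name-link" href="/shop/jackets/gw1diqgyr/n53istanq">Black</a></p>'
    '</div></article>'
)

soup = BeautifulSoup(html, "html.parser")
article = soup.find("article")

# The <article> tag itself carries no href attribute, so .get() returns None.
article_href = article.get("href")

# The first <a> nested inside the article does carry the attribute.
first_link_href = article.find("a").get("href")

print(article_href)
print(first_link_href)
```

So the lookup has to happen on the a tags inside the article, not on the article itself.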

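To get the links, I would recommend selecting the a tags whose href matches a regex (note that this collects every matching link on the page, not only the ones inside the article you filtered), for example:

import re

links = soup.find_all('a', attrs={'href': re.compile('[a-zA-Z0-9_()]')})
# now iterate over the links
for link in links:
    # get the url from the href attribute
    url = link.get('href')
    print(url)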

Answer 2

Can you try this?

for url in soup.find_all('article'):
    if "Gonz Logo" in url.get_text():
        if "Black" in url.get_text():
            for child_a in url.find_all('a'):
                print(child_a['href'])

Answer 3

By slightly modifying Ali Yilmaz's solution as follows (adding href=True):

for url in soup.find_all('article'):
    if "Gonz Logo" in url.get_text():
        if "Black" in url.get_text():
            for child_a in url.find_all('a', href=True):
                print(child_a['href'])

It works fine
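For anyone wondering what href=True changes: it makes find_all skip a tags that have no href at all, so child_a['href'] cannot raise a KeyError on them. Below is a rough sketch of that idea wrapped in a helper. The function name matching_hrefs, the keywords parameter, and the extra <a name="top"> tag are my own additions for illustration and are not from the question; the helper also drops duplicate links, since in the question's markup several a tags point to the same URL.

```python
from bs4 import BeautifulSoup

# Markup based on the question, plus one hypothetical <a> without an href
# (added here only to show what href=True filters out).
html = (
    '<article><div class="inner-article">'
    '<a name="top">anchor without href</a>'
    '<h1><a class="name-link" href="/shop/jackets/gw1diqgyr/n53istanq">'
    'Gonz Logo Coaches Jacket </a></h1>'
    '<p><a class="name-link" href="/shop/jackets/gw1diqgyr/n53istanq">Black</a></p>'
    '</div></article>'
)


def matching_hrefs(soup, keywords):
    """Return unique hrefs from articles whose text contains every keyword."""
    hrefs = []
    for article in soup.find_all("article"):
        text = article.get_text()
        if all(keyword in text for keyword in keywords):
            # href=True keeps only <a> tags that actually have an href.
            for child_a in article.find_all("a", href=True):
                if child_a["href"] not in hrefs:
                    hrefs.append(child_a["href"])
    return hrefs


soup = BeautifulSoup(html, "html.parser")
for href in matching_hrefs(soup, ["Gonz Logo", "Black"]):
    print(href)
```

With this sample markup the loop prints the jacket link once, even though two a tags carry it.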

3 answers

solution1
2 2018-05-13 20:50:16

solution2
1 2018-05-13 20:39:03

solution3
0 2019-11-12 11:17:03
