
Python BeautifulSoup URL

How can I extract just the URL from a tag using BeautifulSoup?

Here is the <a> tag returned by BeautifulSoup:

<a href="https://www.goodsmile.info/zh/products/category/nendoroid_series/announced/2020" ping="/url?sa=t&amp;source=web&amp;rct=j&amp;url=https://www.goodsmile.info/zh/products/category/nendoroid_series/announced/2020&amp;ved=2ahUKEwjT-_Gy4PzsAhWIyosBHd4ZAAkQFjBvegQIYhAC">

But all I want is:

https://www.goodsmile.info/zh/products/category/nendoroid_series/announced/2020

How can I do this?

Here is part of my code:

for hit in soup.find_all(class_='g'):
    Hit_title = hit.find('h3')
    URL=hit.find(class_='yuRUbf').find('a', href=True).get('href')

What do I need to change? Thanks.

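Use .attrs['href'] instead of .get('href'). Both return only the value of the href attribute as a string, not the whole tag; the difference is that .attrs['href'] raises a KeyError when the attribute is missing, whereas .get('href') returns None: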

for hit in soup.find_all(class_='g'):
    Hit_title = hit.find('h3')
    URL=hit.find(class_='yuRUbf').find('a', href=True).attrs['href']
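
For reference, here is a minimal, self-contained sketch of the same loop. The HTML string is a hypothetical, trimmed-down stand-in for the real search results page, and extract_links is an illustrative helper name rather than anything from the original code. It also skips result blocks that have no 'yuRUbf' container or no link, which the loop above would crash on with an AttributeError.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating two result blocks; the real page will differ.
html = """
<div class="g">
  <div class="yuRUbf">
    <a href="https://www.goodsmile.info/zh/products/category/nendoroid_series/announced/2020"
       ping="/url?sa=t&amp;source=web">
      <h3>Example title</h3>
    </a>
  </div>
</div>
<div class="g"><h3>Result without a link</h3></div>
"""

soup = BeautifulSoup(html, "html.parser")


def extract_links(soup):
    """Return (title, url) pairs for every result block that contains a link."""
    results = []
    for hit in soup.find_all(class_="g"):
        container = hit.find(class_="yuRUbf")
        if container is None:
            continue  # block has no link container, skip it
        link = container.find("a", href=True)
        if link is None:
            continue  # container has no <a> with an href
        title = hit.find("h3")
        title_text = title.get_text(strip=True) if title else None
        # link["href"] is shorthand for link.attrs["href"]
        results.append((title_text, link["href"]))
    return results


links = extract_links(soup)
for title, url in links:
    print(title, url)

# All three spellings give the same string when the attribute exists.
link = soup.find("a", href=True)
same = link.get("href") == link["href"] == link.attrs["href"]
```

If the value you get back still looks like the whole tag, check that you are printing URL and not the result of .find('a', href=True) itself.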


 