
Web scraping with Python/Beautiful Soup

I am learning how to web scrape. I am currently trying to find all of the social links in this code:

    <ul class="socials">
   <li class="social instagram">
    <b>
     Instagram:
    </b>
    <a href="https://www.instagram.com/keithgalli/">
     https://www.instagram.com/keithgalli/
    </a>
   </li>
   <li class="social twitter">
    <b>
     Twitter:
    </b>
    <a href="https://twitter.com/keithgalli">
     https://twitter.com/keithgalli
    </a>
   </li>
   <li class="social linkedin">
    <b>
     LinkedIn:
    </b>
    <a href="https://www.linkedin.com/in/keithgalli/">
     https://www.linkedin.com/in/keithgalli/
    </a>
   </li>
   <li class="social tiktok">
    <b>
     TikTok:
    </b>
    <a href="https://www.tiktok.com/@keithgalli">
     https://www.tiktok.com/@keithgalli
    </a>
   </li>
    </ul>

The links are clearly in the anchor tags, but I am having trouble with find_all: when I use it, I only get back one link, and it is not one of the social links. The code I am running is:

href = soup.find_all("a")
print(href)

and the output is:

[<a href="https://keithgalli.github.io/web-scraping/webpage.html">keithgalli.github.io/web-scraping/webpage.html</a>]

I am not sure what I am doing wrong. I thought that calling find_all on the anchor tags would grab all of the hrefs. Any hints or direction would be greatly appreciated.

Try this:

# Loop over every anchor tag and print each one on its own line
for href in soup.find_all("a"):
    print(href)
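
If the loop still prints a single link, the output in the question hints at a possible cause: the one anchor returned points to webpage.html, which suggests the soup may have been built from a different page than the one holding the socials list. That is a guess from the output, not something the question confirms. Below is a minimal sketch, assuming the markup from the question is what actually gets parsed. The html string is a condensed copy of that markup, and html.parser and the name social_links are my own choices for illustration.

```python
from bs4 import BeautifulSoup

# Condensed copy of the markup shown in the question (assumption: this is
# the document being parsed, not some other page).
html = """
<ul class="socials">
  <li class="social instagram"><b>Instagram:</b>
    <a href="https://www.instagram.com/keithgalli/">https://www.instagram.com/keithgalli/</a></li>
  <li class="social twitter"><b>Twitter:</b>
    <a href="https://twitter.com/keithgalli">https://twitter.com/keithgalli</a></li>
  <li class="social linkedin"><b>LinkedIn:</b>
    <a href="https://www.linkedin.com/in/keithgalli/">https://www.linkedin.com/in/keithgalli/</a></li>
  <li class="social tiktok"><b>TikTok:</b>
    <a href="https://www.tiktok.com/@keithgalli">https://www.tiktok.com/@keithgalli</a></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Narrow the search to the socials list first, so unrelated anchors
# elsewhere on a full page would not be picked up.
socials = soup.find("ul", class_="socials")

# find_all returns Tag objects; index each one with "href" to get the URL.
social_links = [a["href"] for a in socials.find_all("a")]

for link in social_links:
    print(link)
```

In a real script, html would be the body of whichever page actually contains the socials list. If soup.find returns None, the list is not in the document that was parsed.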
