How do I get only certain links from a website, rather than all links?

Question

Here's what I have so far:

import requests
from bs4 import BeautifulSoup

def linkScraper():
    # Download the BBC homepage and parse it
    html = requests.get("https://www.bbc.com/").text
    soup = BeautifulSoup(html, 'html.parser')

    # Print the href of every <a> tag on the page
    for link in soup.find_all('a'):
        print(link.get('href'))

linkScraper()

But this prints every single link on the page. How can I change it to give me only the links to the articles that appear on the BBC's homepage?

Answer 1

You can filter the links with a list comprehension:

# find_all returns Tag objects, so test the href string, not the tag itself
links = [a['href'] for a in soup.find_all('a', href=True) if a['href'].startswith('https://www.bbc.com/')]
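
A prefix filter like the one above only matches absolute URLs, so relative hrefs such as "/news/..." would be skipped. The sketch below is one possible way to handle that: it resolves each href against the site root with urllib.parse.urljoin before filtering, and drops duplicates. The helper name filter_links and the "/news/" prefix are assumptions made for illustration; check the hrefs the page actually contains and adjust the prefix to match.

```python
from urllib.parse import urljoin

def filter_links(hrefs,
                 base_url="https://www.bbc.com/",
                 prefix="https://www.bbc.com/news/"):  # prefix is an assumption
    """Resolve each href against base_url and keep those starting with prefix."""
    kept = []
    seen = set()
    for href in hrefs:
        if not href:
            # link.get('href') returns None for <a> tags without an href
            continue
        # Turn relative links ("/news/x") into absolute ones
        absolute = urljoin(base_url, href)
        if absolute.startswith(prefix) and absolute not in seen:
            seen.add(absolute)
            kept.append(absolute)
    return kept
```

With the soup object from the question, the input would be something like [a.get('href') for a in soup.find_all('a')], and the returned list holds absolute URLs in the order they appeared on the page.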

1 2021-10-26 16:43:13
