Scraping a web page with Beautiful Soup

Question

import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
url = input('Enter -')
html = urllib.request.urlopen(url).read()
soup = BeautifulSoup(html,'html.parser')

tags = soup('a')
for tag in tags:
    print(tag.get('herf',None))

I used this link to test my code: http://www.dr-chuck.com/page1.htm

The output is: None

The output should be this link: http://www.dr-chuck.com/page2.htm

Answer 1

Simple typo there.

Change 'herf' to 'href' in the tag.get call:

    import urllib.request, urllib.parse, urllib.error
    from bs4 import BeautifulSoup
    url = input('Enter -')
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html,'html.parser')

    tags = soup('a')
    for tag in tags:
        print(tag.get('href',None))

Output:

#http://www.dr-chuck.com/page2.htm
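
To see why the misspelled name printed None rather than raising an error, here is a minimal sketch. It assumes Tag.get behaves like dict.get (a missing attribute returns the default), and it uses a hypothetical inline HTML string in place of the downloaded page so it runs without network access.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML standing in for the downloaded page (no network needed).
html = '<p><a href="http://www.dr-chuck.com/page2.htm">Second Page</a></p>'
soup = BeautifulSoup(html, 'html.parser')

# soup('a') is shorthand for soup.find_all('a'); take the first anchor tag.
tag = soup('a')[0]

# A misspelled attribute name is simply "missing", so the default comes back.
misspelled = tag.get('herf', None)
correct = tag.get('href', None)

print(misspelled)  # None
print(correct)     # http://www.dr-chuck.com/page2.htm

# Printing the attribute dictionary is a quick way to spot such typos.
print(tag.attrs)   # {'href': 'http://www.dr-chuck.com/page2.htm'}
```

If a silent None is not wanted, indexing with tag['href'] raises a KeyError when the attribute is absent, which makes this kind of typo show up immediately.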

Solution 1
2 2020-05-10 06:55:51