Extracting links from a web page and building a dictionary in Python

Question

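Write a function that opens a web page and returns a dictionary of all the links on that page and their text. The link is the dictionary key and the text is the dictionary value.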

This is what I have so far.

import urllib.request as urlrequest
def getLinks(url):

   page=urlrequest.urlopen(url)

   lines = page.readlines()


   url_list={}
   for line in lines:
      if '<a href=' in line:
          removeHref=line[8:]
          end=removeHref.find('>')
          url=removeHref[0:end]
          removeHref=removeHref[end+1:]
          print (url)
          end2=removeHref.find('<')
          text=removeHref[0:end2]
          print ('%s \n' % text)
          url_list[url] = text



url = input("URL: ")
getLinks(url)

However, when I enter a URL and run it, I get the following error:

 if '<a href=' in line:
 TypeError: a bytes-like object is required, not 'str'

How do I fix this?

Answer 1

You cannot test for containment (`in`) between a str and a bytes object; both operands must be bytes, or both must be str.

Since your web page is returned as a bytes object, you should do:

if b'<a href=' in line:
     pass # your code here
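
Note that with the bytes approach, every other literal in the loop has to be bytes as well (for example `find(b'>')`), otherwise the same TypeError comes back on the next line. An alternative is to decode the response to str once and work with text from there. Below is a minimal sketch of that idea, not a definitive implementation. It assumes the page is UTF-8 encoded, and it swaps the manual string slicing for the standard library's `html.parser`; `LinkCollector` and `extract_links` are names made up for this example. It also returns the dictionary, which the function in the question never does.

```python
from html.parser import HTMLParser
import urllib.request as urlrequest


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects {href: link text} from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = {}
        self._href = None   # href of the <a> tag currently open, if any
        self._chunks = []   # pieces of text seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._chunks = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a href=...> tag.
        if self._href is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links[self._href] = "".join(self._chunks).strip()
            self._href = None


def extract_links(html_text):
    """Return a dict mapping each link's href to its text (str in, dict out)."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


def getLinks(url):
    with urlrequest.urlopen(url) as page:
        # read() returns bytes; decode so that str operations work.
        # Assumption: the page is UTF-8. The real charset may differ.
        html_text = page.read().decode("utf-8", errors="replace")
    return extract_links(html_text)
```

Calling `getLinks(url)` with the URL from `input("URL: ")` would then give back the dictionary instead of only printing. If two links share the same href, the later text overwrites the earlier one, since the href is the dictionary key.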

Solution 1
1 2016-06-03 22:51:59
