Keep getting "TypeError: 'NoneType' object is not callable" with Beautiful Soup and Python 3

Question

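I'm a beginner struggling through a course, so this problem may be really simple. I am running this (admittedly messy) code, saved in the file x.py, to extract a link and a name from a website whose lines have the following format: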

<li style="margin-top: 21px;">
  <a href="http://py4e-data.dr-chuck.net/known_by_Prabhjoit.html">Prabhjoit</a>
</li>

So I set this up:

import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
import ssl

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input('Enter - ')
html = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, 'html.parser')
for line in soup:
    if not line.startswith('<li'):
        continue
    stuff = line.split('"')
    link = stuff[3]
    thing = stuff[4].split('<')
    name = thing[0].split('>')
    count = count + 1
    if count == 18:
        break
print(name[1])
print(link)

It keeps producing this error:

Traceback (most recent call last):
  File "x.py", line 15, in <module>
    if not line.startswith('<li'):
TypeError: 'NoneType' object is not callable

I have been struggling with this for hours and would appreciate any suggestions.

Answer 1

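line is not a string, and it has no startswith() method. It is a BeautifulSoup Tag object, because BeautifulSoup has already parsed the HTML source text into a rich object model. Don't try to treat it as text!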

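The error occurs because when you access an attribute that a Tag object doesn't know about, it searches for a child element with that name (so here it executes line.find('startswith')), and since no element has that name, None is returned. Calling None('<li') then fails with the error you see.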
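The attribute fallback described above can be reproduced on a tiny document. This is a minimal sketch, assuming a Beautiful Soup 4 version in which unknown attribute names on a Tag are treated as child-tag lookups; the HTML snippet and the example.com URL are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical one-item list, shaped like the page in the question
html = '<ul><li><a href="http://example.com/a.html">Alice</a></li></ul>'
soup = BeautifulSoup(html, 'html.parser')
li = soup.find('li')

# An attribute name the Tag does not define is looked up as a child tag name.
# There is no <startswith> element inside the <li>, so the lookup yields None.
fallback = li.startswith
print(fallback)

# The same mechanism is what makes li.a a shortcut for li.find('a').
child_a = li.a
print(child_a)

# Calling the None that came back raises the TypeError from the question.
error_message = None
try:
    li.startswith('<li')
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

Note that plain text nodes yielded while iterating over the soup are string subclasses and do support startswith(), so the failure only shows up once the loop reaches a Tag.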

If you want to find the 18th <li> element, just ask BeautifulSoup for that specific element:

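# html holds the page source fetched earlier with urllib
soup = BeautifulSoup(html, 'html.parser')
# Collect at most 18 <a href="..."> elements found inside <li> elements
li_link_elements = soup.select('li a[href]', limit=18)
# Only print when the page really contains an 18th link
if len(li_link_elements) == 18:
    last = li_link_elements[-1]
    print(last.get_text())   # the link text (the name)
    print(last['href'])      # the link target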

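This uses a CSS selector to find only <a> link elements that are nested inside an <li> element and that have an href attribute. The search is limited to 18 such tags, and the last one is printed, but only if the page really contains 18 of them.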
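The same selector-plus-limit idea can be tried offline on a small document. The sketch below is only an illustration: the helper name nth_li_link, the sample HTML, and the example.com URLs are assumptions made for this example and are not part of the original answer.

```python
from bs4 import BeautifulSoup

def nth_li_link(html, n):
    """Return the n-th (1-based) <a href> found inside an <li>, or None if there are fewer."""
    if n < 1:
        raise ValueError('n must be at least 1')
    soup = BeautifulSoup(html, 'html.parser')
    # limit=n stops the search as soon as n matching tags have been collected
    links = soup.select('li a[href]', limit=n)
    return links[-1] if len(links) == n else None

# Hypothetical sample: the middle <a> has no href, so a[href] skips it
sample = (
    '<ul>'
    '<li><a href="http://example.com/1.html">One</a></li>'
    '<li><a>No href here</a></li>'
    '<li><a href="http://example.com/2.html">Two</a></li>'
    '</ul>'
)

second = nth_li_link(sample, 2)   # the second link that carries an href
third = nth_li_link(sample, 3)    # fewer than three such links exist
print(second, third)
```

Returning None when there are too few matches mirrors the length check in the answer, so the caller can tell "not found" apart from a real element.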

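The element text is retrieved with the get_text() method, which also includes the text of any nested elements (such as <span>, <strong>, or other extra markup), and the href attribute is accessed with standard indexing notation.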
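To see why get_text() is convenient when a link contains extra markup, here is a small sketch. The HTML fragment is invented for illustration, and the use of Tag.get() for an optional attribute is an addition that the original answer does not mention.

```python
from bs4 import BeautifulSoup

# Hypothetical link whose text is split across a nested <strong> element
html = '<li><a href="http://example.com/x.html"><strong>Pra</strong>bhjoit</a></li>'
soup = BeautifulSoup(html, 'html.parser')
link = soup.select_one('li a[href]')

# get_text() concatenates the text of the element and all nested elements
name = link.get_text()

# Indexing reads an attribute and raises KeyError if it is absent
href = link['href']

# get() returns None instead of raising when the attribute is absent
title = link.get('title')

print(name, href, title)
```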

Answer 1: accepted, 2018-08-27 17:06:48
