Getting a column from a table with Python and Beautiful Soup

Question

I'm new to Python and I'm trying to get the "Price" column of data from a table, but I can't retrieve that data.

What I'm currently doing:

# Libraries
from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://pythonscraping.com/pages/page3.html")
soup = BeautifulSoup(html, "html.parser")
table = soup.find("table")

for row in table.find_all("tr"):

    col = row.find_all("td")

    print(col[2])
    print("---")

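I keep getting "IndexError: list index out of range". I have read the documentation and tried several different approaches, but I can't seem to work it out.

Also, I'm using Python 3.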

Answer 1

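The problem is that you are iterating over every tr in the table, including the header tr at the start, which you don't need. That row presumably has no td cells, so col[2] fails on it. Skip the header row:

# Libraries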
from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://pythonscraping.com/pages/page3.html")
soup = BeautifulSoup(html, "html.parser")
table = soup.find("table")

for row in table.find_all("tr")[1:]:

    col = row.find_all("td")

    print(col[2])
    print("---")
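
As a minimal sketch of the same idea that runs without network access, the snippet below parses a small inline table, skips the header row with the [1:] slice, and collects the text of the third cell rather than the whole td tag. SAMPLE_HTML and price_column are hypothetical names made up for this illustration, and the markup only imitates a table whose first row is a header; it is not a copy of the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (assumption: the
# first row is a header made of th cells, the rest are td data rows).
SAMPLE_HTML = """
<table>
  <tr><th>Item Title</th><th>Description</th><th>Cost</th></tr>
  <tr><td>Vegetable Basket</td><td>A basket of vegetables</td><td>$15.00</td></tr>
  <tr><td>Russian Nesting Dolls</td><td>Hand-painted dolls</td><td>$10,000.52</td></tr>
</table>
"""


def price_column(html):
    """Return the text of the third cell of every non-header row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    prices = []
    # [1:] drops the header row, which has no td cells
    for row in table.find_all("tr")[1:]:
        cells = row.find_all("td")
        # get_text(strip=True) yields the cell text without the <td> tag
        prices.append(cells[2].get_text(strip=True))
    return prices


print(price_column(SAMPLE_HTML))
```

Using get_text() is a choice made for this sketch: print(col[2]) in the answer prints the tag itself, markup included, which may or may not be what you want.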

Answer 2

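It probably means that one of the rows has no td tags. You can wrap the print, or any other use of col[2], in a try/except block and ignore the cases where col is empty or has fewer than three items: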

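# Same setup as in the question
from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://pythonscraping.com/pages/page3.html")
soup = BeautifulSoup(html, "html.parser")
table = soup.find("table")

for row in table.find_all("tr"):
    col = row.find_all("td")
    try:
        print(col[2])
        print("---")
    except IndexError:
        # Row has fewer than three td cells (e.g. the header row); skip it
        pass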
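
An alternative to catching the exception is to check the number of cells before indexing. The sketch below does that on a small inline table. SAMPLE_HTML and cells_at are hypothetical names used only for this illustration, and the short "Spacer row" is invented to show a row with too few cells.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a header row (th cells), two full data rows,
# and one short row with a single td cell.
SAMPLE_HTML = """
<table>
  <tr><th>Item Title</th><th>Description</th><th>Cost</th></tr>
  <tr><td>Vegetable Basket</td><td>A basket of vegetables</td><td>$15.00</td></tr>
  <tr><td>Spacer row</td></tr>
  <tr><td>Dead Parrot</td><td>An ex-parrot</td><td>$0.50</td></tr>
</table>
"""


def cells_at(html, index=2):
    """Collect the text of cell `index` from rows that have enough td cells."""
    soup = BeautifulSoup(html, "html.parser")
    values = []
    for row in soup.find_all("tr"):
        cells = row.find_all("td")
        # Skip header rows (no td cells) and rows that are too short
        if len(cells) <= index:
            continue
        values.append(cells[index].get_text(strip=True))
    return values


print(cells_at(SAMPLE_HTML))
```

The length check keeps the skip condition explicit, while the try/except version is shorter. Both ignore the same rows.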

2 answers

Answer 1
1, accepted, 2017-03-03 22:57:25

Answer 2
0, 2017-03-03 22:56:13
