Parsing a table with Beautiful Soup

Question

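I have been struggling with Beautiful Soup and a particular webpage. I want to parse a specific table from the page, but I am running into problems. My code is as follows: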

# -*- coding: cp1252 -*-
import urllib2

from bs4 import BeautifulSoup

page = urllib2.urlopen("http://www.snet.gob.sv/googlemaps/workstation/main.php").read()
soup = BeautifulSoup(page)

data = []
table = soup.find("table", { "class" : "mytable" })
table_body = table.find('tbody')

rows = table_body.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele]) # Get rid of empty values

print data

It works for other webpages, but not for this one. I get the following error:

table_body = table.find('tbody')
AttributeError: 'NoneType' object has no attribute 'find'

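It seems that the "tbody" tag is not being found, but I have checked and it is in the page source. Another problem: when the code does work (on other webpages), a "u" appears next to every item of the table. I have searched a lot but cannot find the cause. Thanks for your help.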

Answer 1

No, the error

AttributeError: 'NoneType' object has no attribute 'find'

means that table is None, which in turn means that the call

soup.find("table", { "class" : "mytable" })

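returned None: the page contains no table whose class attribute has the value mytable. The failure happens before tbody is ever searched for.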
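To illustrate the point, here is a minimal sketch written for Python 3 with the built-in html.parser backend. The SAMPLE_HTML string is a made-up stand-in for the real page (an assumption, not the actual markup); it contains a table without any class, so the same find call comes back empty.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page: one table, no class attribute.
SAMPLE_HTML = """
<table>
  <tr><td>Station</td><td>Value</td></tr>
  <tr><td>A-01</td><td>12.5</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Same lookup as in the question: no table has class "mytable", so find() returns None.
table = soup.find("table", {"class": "mytable"})

# Check for None before calling any method on the result.
if table is None:
    message = "no <table class='mytable'> in this document"
else:
    message = "found the table"
print(message)
```

Checking the result of find for None before chaining another call turns the AttributeError into a readable message.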

You cannot assume that the HTML is structured the same way across different webpages (otherwise all webpages would look exactly alike).

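I checked the URL, and there is indeed no table with that class; in fact, none of the tables on that page has any class at all. You will need to identify the table you are looking for and specify the search criteria accordingly.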
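One possible way to identify the table is to list every table on the page and then pick one by position or content. The sketch below is written for Python 3 and runs on invented markup (SAMPLE_HTML); the helper names list_tables and extract_rows are hypothetical, and the index 1 used at the end applies only to this sample, not to the real page. It also falls back to the table itself when no tbody is present, since browsers often show a tbody that is missing from the raw HTML.

```python
from bs4 import BeautifulSoup

# Invented markup: two class-less tables, neither with an explicit <tbody>.
SAMPLE_HTML = """
<html><body>
<table><tr><td>menu</td></tr></table>
<table>
  <tr><td>Station</td><td>Temp</td><td></td></tr>
  <tr><td> A-01 </td><td>12.5</td><td></td></tr>
</table>
</body></html>
"""


def list_tables(soup):
    """Return (index, class attribute, row count) for each table, to help pick one."""
    return [(i, t.get("class"), len(t.find_all("tr")))
            for i, t in enumerate(soup.find_all("table"))]


def extract_rows(table):
    """Collect the non-empty cell texts of every row in the given table."""
    body = table.find("tbody")
    if body is None:
        # No <tbody> in the source: search the rows directly under the table.
        body = table
    rows = []
    for tr in body.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        rows.append([c for c in cells if c])  # drop empty cells
    return rows


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
summary = list_tables(soup)

# In this sample the second table (index 1) holds the data.
data = extract_rows(soup.find_all("table")[1])
print(summary)
print(data)
```

As for the "u" next to each item: in Python 2, printing a list shows the repr of each element, and unicode strings are displayed with a u prefix. The prefix is not part of the data; printing the items one by one shows them without it.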

1 2015-08-15 05:16:37
