How to parse an HTML table with Python and BeautifulSoup and write it to CSV

Question

I am trying to parse an HTML page, extract the currency values, and write them to a CSV file. I have the following code:

#!/usr/bin/env python

import urllib2
from BeautifulSoup import BeautifulSoup

contenturl = "http://www.bank.gov.ua/control/en/curmetal/detail/currency?period=daily"
soup = BeautifulSoup(urllib2.urlopen(contenturl).read())

table = soup.find('div', attrs={'class': 'content'})

rows = table.findAll('tr')
for tr in rows:
    cols = tr.findAll('td')
    for td in cols:
        text = td.find(text=True) + ';'
        print text,
    print

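The problem is that I don't know how to retrieve only the currency values. I tried a regular expression like '^[0-9]{3}' (starts with three digits), but it doesn't work.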

Answer 1

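You're better off picking out specific cells in the table. The td cells with the cell_c class contain the data you're interested in, and the last cell in each of those rows is always the currency rate: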

rows = table.findAll('tr')
for tr in rows:
    cols = tr.findAll('td')
    # Skip rows without td cells (e.g. headers) or without a class attribute.
    if cols and 'cell_c' in cols[0].get('class', ''):
        # currency row
        digital_code, letter_code, units, name, rate = [c.text for c in cols]
        print digital_code, letter_code, units, name, rate

With the data in separate variables, you can now convert the text to decimal numbers, store them in a database, or do whatever else you need.
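Since the question asks for CSV output, here is a minimal sketch of that last step. It assumes Python 3 (the code above is Python 2) and assumes each currency row has already been collected as a tuple of five strings, as in the loop above. The function name write_rates, the column headers, the ';' delimiter, and the sample values are all illustrative, not taken from the bank's page.

```python
import csv
import io
from decimal import Decimal


def write_rates(rows, out, delimiter=';'):
    """Write (digital_code, letter_code, units, name, rate) string tuples as CSV.

    `out` is any text file-like object. Column names are hypothetical.
    """
    writer = csv.writer(out, delimiter=delimiter)
    writer.writerow(['digital_code', 'letter_code', 'units', 'name', 'rate'])
    for digital_code, letter_code, units, name, rate in rows:
        writer.writerow([
            digital_code.strip(),   # keep as text so leading zeros survive
            letter_code.strip(),
            int(units),             # number of currency units the rate applies to
            name.strip(),
            Decimal(rate.strip()),  # Decimal avoids binary floating-point rounding
        ])


# Made-up sample row, only to show the call; real rows come from the parsed table.
sample = [("036", "AUD", "100", "Australian Dollar", "817.0128")]
buf = io.StringIO()
write_rates(sample, buf)
print(buf.getvalue())
```

To write to disk instead, pass a real file, for example open('rates.csv', 'w', newline=''), so the csv module controls the line endings itself.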

Answer 1: 9, accepted, 2013-03-06 14:59:18
