Parsing an HTML table using Beautiful Soup

I wrote this code to print the table shown at http://www.medindia.net/drug-price/list.asp

import mechanize
import urllib2
from bs4 import BeautifulSoup

med="paracetamol"
br=mechanize.Browser()
br.set_handle_robots(False)
res=br.open("http://www.medindia.net/drug-price/")
br.select_form("frmdruginfo_search")
br.form['druginfosearch']=med
br.submit()
url=br.response().geturl()
print url
web_page = urllib2.urlopen(url)
soup = BeautifulSoup(web_page)
tabl=soup.find_all('table')
rows=tabl.find_all('tr')

for tr in rows:
        cols=tr.find_all('td')
        for td in cols:
              text = ''.join(td.find(text=True))
              print text+"|",

But when I run it, I get this error:

 rows=tabl.find_all('tr')
    AttributeError: 'list' object has no attribute 'find_all'

Can anyone please help me to solve this? Thanks!

soup.find_all('table') returns a list of all matching tables, and a list has no find_all() method. You only need one table, so use find(), which returns a single element:

tabl = soup.find('table', {'class': 'content-table'})
rows = tabl.find_all('tr')

Note that I'm explicitly asking for the table with a specific class, content-table, so that other tables on the page are not picked up by mistake.
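
To see the difference on a small scale, here is a minimal sketch. The sample HTML and the table_rows helper are made up for illustration and are not taken from the medindia page; the explicit "html.parser" argument is also my assumption, chosen so the snippet does not depend on lxml. One more thing worth knowing: find() returns None when nothing matches, so the helper checks for that before calling find_all().

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: two tables, only one of
# which carries the class we are interested in.
SAMPLE_HTML = """
<table class="nav"><tr><td>menu</td></tr></table>
<table class="content-table">
  <tr><td>SNo</td><td>Prescribing Information</td></tr>
  <tr><td>1</td><td>Abacavir</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find_all() always returns a list-like ResultSet, even for one match.
tables = soup.find_all("table")

# find() returns the first matching Tag, or None if nothing matches.
table = soup.find("table", {"class": "content-table"})


def table_rows(tag):
    """Return the cell text of every row in a table Tag (hypothetical helper)."""
    if tag is None:
        return []
    return [
        [td.get_text(strip=True) for td in tr.find_all("td")]
        for tr in tag.find_all("tr")
    ]


rows = table_rows(table)
print(rows)
```

If you do want to keep find_all('table'), you have to index into the result (for example tables[1] here) or loop over it before calling find_all('tr').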

You also don't need a separate urllib2 call to fetch the page again: br.response().read() already gives you the HTML of the page that mechanize loaded, and you can pass that straight to BeautifulSoup.

Just FYI, if you want better-formatted table output on the console, consider using texttable:

import mechanize
from bs4 import BeautifulSoup
import texttable

med = raw_input("Enter the drugname: ")

# Open the search page and submit the form with the drug name
br = mechanize.Browser()
br.set_handle_robots(False)
br.open("http://www.medindia.net/drug-price/")
br.select_form("frmdruginfo_search")
br.form['druginfosearch'] = med
br.submit()

# Parse the response mechanize already has; no second request needed
soup = BeautifulSoup(br.response().read())

# find() returns a single table, so find_all('tr') works on it
tabl = soup.find('table', {'class': 'content-table'})
table = texttable.Texttable()
for tr in tabl.find_all('tr'):
    # One texttable row per <tr>, with the whitespace-stripped cell text
    table.add_row([td.text.strip() for td in tr.find_all('td')])

print table.draw()

prints:

+--------------+--------------+--------------+--------------+--------------+
| SNo          | Prescribing  | Total No of  | Single       | Combination  |
|              | Information  | Brands       |     Generic  |     of       |
|              |              | (Single+Comb |              | Generic(s)   |
|              |              | ination)     |              |              |
+--------------+--------------+--------------+--------------+--------------+
| 1            | Abacavir     | 6            | View Price   | -            |
+--------------+--------------+--------------+--------------+--------------+
| 2            | Abciximab    | 1            | View Price   | -            |
+--------------+--------------+--------------+--------------+--------------+
| 3            | Acamprosate  | 3            | View Price   | -            |
+--------------+--------------+--------------+--------------+--------------+
| 4            | Acarbose     | 41           | View Price   | -            |
+--------------+--------------+--------------+--------------+--------------+
...
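
If installing texttable is not an option, roughly aligned output can also be produced with plain string padding. This is only a sketch under my own assumptions: format_rows is a hypothetical helper, not part of Beautiful Soup or texttable, and the rows are sample data copied from the output above rather than scraped live.

```python
def format_rows(rows):
    """Pad each cell to its column's widest entry and join cells with ' | '."""
    if not rows:
        return ""
    ncols = max(len(r) for r in rows)
    # Pad short rows with empty cells so every row has the same length.
    padded = [list(r) + [""] * (ncols - len(r)) for r in rows]
    # Width of each column is the length of its longest cell.
    widths = [max(len(row[i]) for row in padded) for i in range(ncols)]
    lines = [
        " | ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in padded
    ]
    return "\n".join(lines)


# Sample rows; in the script above these would come from the <td> cells.
rows = [
    ["SNo", "Prescribing Information", "Total No of Brands"],
    ["1", "Abacavir", "6"],
    ["2", "Abciximab", "1"],
]
print(format_rows(rows))
```

Unlike texttable, this does not wrap long cells or draw borders; it only lines up the columns.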
