Download data using selenium

I am a research analyst trying to collate data and perform analysis. I need data from this page, specifically the categories from Abrasives to Vanaspati Oils (listed on the left side). I run into this kind of problem often, and I figured that selenium should be able to handle it. But I am stuck on how to download this data into Excel. I need one Excel sheet for each category. My exact technical question is: how do I download the table data? I did some background research and understood from here that the data can be extracted if the table has a class name. The table has class="tbldata14 bdrtpg", so I used that in my code and got this error:

InvalidSelectorException: Message: The given selector tbldata14 bdrtpg is either invalid or does not result in a WebElement.

How can I download this table data? Point me to any references that I can read and solve this problem. My code:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Firefox()

driver.get("http://www.moneycontrol.com/stocks/marketinfo/netprofit/bse/index.html")
elem=driver.find_element_by_class_name("tbldata14 bdrtpg")

Thanks in advance. Also, please suggest another simple way if there is one [I tried copy-paste; it is too tedious!]

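The error occurs because find_element_by_class_name accepts only a single class name, while "tbldata14 bdrtpg" is two class names separated by a space. Matching the full class attribute with XPath avoids this. Fetching the data you're interested in can be achieved as follows: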

from selenium import webdriver
from selenium.webdriver.common.by import By

url = "http://www.moneycontrol.com/stocks/marketinfo/netprofit/bse/index.html"

# Get table cells that contain an anchor or text
xpath = "//table[@class='tbldata14 bdrtpg']//tr//td[child::a|text()]"

driver = webdriver.Firefox()
driver.get(url)
# The find_elements_by_* helpers were removed in Selenium 4.3;
# find_elements(By.XPATH, ...) is the current equivalent.
data = driver.find_elements(By.XPATH, xpath)

# Group the cells so that each row contains 5 elements
rows = [data[x:x+5] for x in range(0, len(data), 5)]
for r in rows:
    print("Company {}, Last Price {}, Change {}, % Change {}, Net Profit {}"
          .format(r[0].text, r[1].text, r[2].text, r[3].text, r[4].text))

# Close the browser once the data has been read
driver.quit()

Writing the data to an Excel file is explained here.
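As a rough illustration of that last step, here is a minimal sketch that saves the scraped values as a CSV file, which Excel opens directly. It swaps in Python's standard csv module rather than a true .xlsx writer. The names HEADER, chunk and write_rows_to_csv are hypothetical helpers made up for this example, and the column names are simply the ones printed in the code above.

```python
import csv

# Column names taken from the print statement in the answer (an assumption
# about what the five cells in each table row mean).
HEADER = ["Company", "Last Price", "Change", "% Change", "Net Profit"]


def chunk(cells, size=5):
    """Group a flat list of cell texts into rows of `size` items."""
    return [cells[i:i + size] for i in range(0, len(cells), size)]


def write_rows_to_csv(rows, path):
    """Write the header followed by the rows to a CSV file at `path`."""
    # newline="" stops the csv module from emitting blank lines on Windows
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        writer.writerows(rows)
```

With the scraping code above, something like write_rows_to_csv(chunk([cell.text for cell in data]), "abrasives.csv") would produce one file for the category currently loaded, as long as it runs before the browser is closed. Repeating that for each category URL would give one file per category.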

