
How to scrape data generated by JavaScript using Python

I want to scrape the number of participants from the following news article. The URL is http://news.sina.com.cn/c/2013-07-11/175827642839.shtml and the number I want to get is 820. It is generated by JavaScript. How can I get that number in a simple way?

You could analyze the JavaScript code and do the same thing in Python, or you can use Selenium from Python to drive a real browser.
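For the first option, here is a minimal sketch of what "doing the same in Python" might look like. Counters like this are often filled in from a small JSON or JSONP endpoint that you can spot in the browser's network tab. I have not verified that this page works that way, so treat everything specific here as an assumption: the callback name `cb`, the `count`/`total` fields, and the sample response body are all made up to show the parsing step, and `fetch_json` is a hypothetical helper you would point at whatever URL you actually find.

```python
import json
import re
from urllib.request import urlopen


def parse_jsonp(text):
    """Strip a JSONP wrapper such as cb({...}); and return the decoded JSON.

    Plain JSON (no callback wrapper) is decoded unchanged.
    """
    match = re.match(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", text, re.DOTALL)
    payload = match.group(1) if match else text
    return json.loads(payload)


def fetch_json(url):
    """Download a JSON/JSONP endpoint found in the browser's network tab."""
    with urlopen(url, timeout=10) as response:
        return parse_jsonp(response.read().decode("utf-8"))


# Hypothetical response body; the real endpoint and field names may differ.
sample = 'cb({"count": {"total": 820}});'
data = parse_jsonp(sample)
print(data["count"]["total"])
```

If the page builds the number in some other way (for example, computed in the script itself), this approach may not apply and a real browser is the simpler route.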

Edit:

Here is the example from the Selenium page, changed to do what you need.

It opens a browser (Firefox), waits 5 seconds for the page to load, and gets the element's text.

#!/usr/bin/env python3

import time

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

browser = webdriver.Firefox()  # Start a local Firefox session
try:
    browser.get("http://news.sina.com.cn/c/2013-07-11/175827642839.shtml")  # Load page
    time.sleep(5)  # Give the page's JavaScript time to run
    # Find the element that holds the number (Selenium 4 locator syntax)
    element = browser.find_element(By.XPATH, "//span[contains(@class,'f_red')]")
    print(element.text)  # Print the element text
except NoSuchElementException:
    raise SystemExit("can't find f_red")
finally:
    browser.quit()  # Close the browser even if the lookup fails
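A fixed five-second sleep is either wasted time or too short on a slow connection. As a sketch, the same lookup can use Selenium's explicit wait (`WebDriverWait`, Selenium 4 API) so it returns as soon as the text appears. The helper names `parse_count`, `wait_for_text`, and `scrape_count` and the 15-second timeout are my own choices, and I am assuming the span's text is just the number, possibly with thousands separators.

```python
import re


def parse_count(text):
    """Turn element text such as '820' or '1,024' into an int."""
    digits = re.sub(r"[^\d]", "", text)
    if not digits:
        raise ValueError("no digits in %r" % text)
    return int(digits)


def wait_for_text(browser, xpath, timeout=15):
    """Poll until the element exists and has non-empty text, then return it."""
    # Imported here so parse_count can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    # WebDriverWait ignores NoSuchElementException while polling and
    # raises TimeoutException if the text never appears.
    return WebDriverWait(browser, timeout).until(
        lambda driver: driver.find_element(By.XPATH, xpath).text.strip()
    )


def scrape_count():
    """Open the article in Firefox and print the number in the f_red span."""
    from selenium import webdriver

    browser = webdriver.Firefox()
    try:
        browser.get("http://news.sina.com.cn/c/2013-07-11/175827642839.shtml")
        text = wait_for_text(browser, "//span[contains(@class,'f_red')]")
        print(parse_count(text))
    finally:
        browser.quit()  # Always close the browser, even on timeout
```

Calling `scrape_count()` opens Firefox, so it needs Selenium and a Firefox driver available on the machine.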

