Python Scraping JavaScript using Selenium and Beautiful Soup

Question

I'm trying to scrape a JavaScript enables page using BS and Selenium. I have the following code so far. It still doesn't somehow detect the JavaScript (and returns a null value). In this case I'm trying to scrape the Facebook comments in the bottom. (Inspect element shows the class as postText)
Thanks for the help!

from selenium import webdriver  
from selenium.common.exceptions import NoSuchElementException  
from selenium.webdriver.common.keys import Keys  
import BeautifulSoup

browser = webdriver.Firefox()  
browser.get('http://techcrunch.com/2012/05/15/facebook-lightbox/')  
html_source = browser.page_source  
browser.quit()

soup = BeautifulSoup.BeautifulSoup(html_source)  
comments = soup("div", {"class":"postText"})  
print comments

Answer 1

There are some mistakes in your code that are fixed below. However, the class "postText" must exist elsewhere, since it is not defined in the original source code. My revised version of your code was tested and is working on multiple websites.

from selenium import webdriver  
from selenium.common.exceptions import NoSuchElementException  
from selenium.webdriver.common.keys import Keys  
from bs4 import BeautifulSoup

browser = webdriver.Firefox()  
browser.get('http://techcrunch.com/2012/05/15/facebook-lightbox/')  
html_source = browser.page_source  
browser.quit()

soup = BeautifulSoup(html_source,'html.parser')  
#class "postText" is not defined in the source code
comments = soup.findAll('div',{'class':'postText'})  
print comments

Python Scraping JavaScript using Selenium and Beautiful Soup

Question

1 answers

solution1
7 2014-03-22 05:04:18

Python Scraping JavaScript using Selenium and Beautiful Soup

Question

1 answers

solution1 7 2014-03-22 05:04:18

solution1
7 2014-03-22 05:04:18