Scraping hidden product details on a webpage using Selenium

Sorry, I am a Selenium noob. I have done a lot of reading but am still having trouble getting the product price (£0.55) from this page: https://groceries.asda.com/product/spaghetti-tagliatelle/asda-spaghetti/36628 . The product details are not visible when parsing the HTML with bs4. With Selenium I can get the entire page as a string and can see the price in it (using the following code). I should be able to extract the price from that string somehow, but I would prefer a less hacky solution.

from selenium import webdriver

browser = webdriver.Firefox(executable_path=r'C:\Users\Paul\geckodriver.exe')
browser.get('https://groceries.asda.com/product/tinned-tomatoes/asda-smart-price-chopped-tomatoes-in-tomato-juice/19560')
content = browser.page_source

If I run something like this:

elem = browser.find_element_by_id("bodyContainerTemplate")
print(elem)

It just prints: <selenium.webdriver.firefox.webelement.FirefoxWebElement (session="df23fae6-e99c-403c-a992-a1adf1cb8010", element="6d9aac0b-2e98-4bb5-b8af-fcbe443af906")>

The price is the text of this element: <p class="prod-price">, but I cannot seem to get this working. How should I go about getting this text (the product price)?

The type of elem is WebElement, so printing it shows the object itself rather than its contents. To extract the text of a web element, read its text attribute, as in the code below:

elem = driver.find_element_by_class_name("prod-price-inner")
print(elem.text)

Try this solution; it works with Selenium and BeautifulSoup:

from bs4 import BeautifulSoup
from selenium import webdriver

url='https://groceries.asda.com/product/spaghetti-tagliatelle/asda-spaghetti/36628'

driver = webdriver.PhantomJS()
driver.get(url)

data = driver.page_source

soup = BeautifulSoup(data, 'html.parser')

ele = soup.find('span',{'class':'prod-price-inner'})

print(ele.text)

driver.quit()

It will print:

£0.55
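
If the price is needed as a number rather than as text, the currency symbol has to be stripped from the string printed above. A small sketch using only the standard library follows. parse_price is a hypothetical helper, and it assumes prices look like £0.55, optionally with thousands separators.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Return the first decimal amount found in text, or None if there is none."""
    # Drop thousands separators, then take the first run of digits
    # with an optional fractional part.
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return Decimal(match.group()) if match else None


print(parse_price("£0.55"))
```

Decimal is used instead of float so that amounts such as 0.55 are stored exactly.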
