I'm trying to scrape the price from this page: https://www.kbb.com/cadillac/deville/1996/sedan-4d/
The prices are shown in <text> elements inside an <svg> element.
When I use the XPath .//*[name()='svg']//*[name()='g']//*[name()='text']
in the browser's inspector, it finds the elements, but the same XPath does not work in my code.
The current code is:
def get_price(url):
    driver.get(url)
    # wait for the page to finish loading
    time.sleep(10)
    try:
        price_tags = driver.find_elements_by_xpath(".//*[name()='svg']//*[name()='g']//*[name()='text']")
    except Exception:
        print("price not found")
        return
    for p in price_tags:
        print(p.text)
When I run the above code, find_elements_by_xpath returns an empty list. I also tried other things, such as switching to the default content, because the element is inside a #document node:
driver.switch_to_default_content()
but this didn't work either. If there is any other way to scrape the price, please let me know.
It is an external SVG, and it seems Selenium doesn't have it in the DOM, so I had to get the <object> element, which has the URL of this SVG file, read that URL from its data attribute, download the file using requests, and get the text using BeautifulSoup.
from selenium import webdriver
import time
import requests
from bs4 import BeautifulSoup
url = 'https://www.kbb.com/cadillac/deville/1996/sedan-4d/'
driver = webdriver.Firefox()
driver.get(url)
time.sleep(5)
# doesn't work - always empty list
#price_xpaths = driver.find_elements_by_xpath(".//*[name()='svg']//*[name()='g']//*[name()='text']")
#price_xpaths = driver.find_elements_by_xpath('//svg')
#price_xpaths = driver.find_elements_by_xpath('//svg//g//text')
#price_xpaths = driver.find_elements_by_xpath('//*[@id="PriceAdvisor"]')
#print(price_xpaths) # always empty list
# single element `object`
svg_item = driver.find_element_by_xpath('//object[@id="PriceAdvisorFrame"]')
# doesn't work - always empty string
#print(svg_item.get_attribute('innerHTML'))
# get url to file SVG
svg_url = svg_item.get_attribute('data')
print(svg_url)
# download it and parse
r = requests.get(svg_url)
soup = BeautifulSoup(r.content, 'html.parser')
text_items = soup.find_all('text')
for item in text_items:
    print(item.text)
Result:
Fair Market Range
$1,391 - $2,950
Fair Purchase Price
$2,171
Typical
Listing Price
$2,476
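If you need numbers rather than raw strings, the printed lines could be turned into a small dictionary. This is only a minimal sketch: parse_prices and PRICE_RE are names I made up for illustration, and it assumes every label is followed by a line with one or more dollar amounts, as in the result above.

```python
import re

# Matches dollar amounts such as "$1,391" or "$2,171".
PRICE_RE = re.compile(r"\$(\d[\d,]*)")

def parse_prices(lines):
    """Pair each label with the dollar amounts on the line that follows it.

    `lines` holds one string per <text> element. Lines without a dollar
    amount are treated as (part of) a label, so "Typical" and
    "Listing Price" are joined into one label.
    """
    prices = {}
    label_parts = []
    for line in lines:
        amounts = [int(m.replace(",", "")) for m in PRICE_RE.findall(line)]
        if amounts:
            prices[" ".join(label_parts)] = amounts
            label_parts = []
        elif line.strip():
            label_parts.append(line.strip())
    return prices

# The lines from the result above, used as sample input.
sample = [
    "Fair Market Range",
    "$1,391 - $2,950",
    "Fair Purchase Price",
    "$2,171",
    "Typical",
    "Listing Price",
    "$2,476",
]
print(parse_prices(sample))
```

With the scraping code above, the input would be [item.text for item in text_items] instead of the hard-coded sample.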
BTW, information for other users: I had to use a proxy/VPN with an IP located in the US to see this page. For a location in PL it displays:
Access Denied.
You don't have permission to access "http://www.kbb.com/cadillac/deville/1996/sedan-4d/" on this server.
Sometimes it gives me this message even for a location in the US.
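Because the block can appear at random, it may be worth checking for it before parsing. A rough sketch follows; is_access_denied is a hypothetical helper, and it assumes the block shows up as HTTP status 403 and/or the phrase "Access Denied" in the body, which I have not verified for this site.

```python
def is_access_denied(status_code, body):
    """Return True if a response looks like the "Access Denied" page.

    Assumption: the block is signalled by HTTP 403 and/or the phrase
    "Access Denied" in the body; the site may behave differently.
    """
    return status_code == 403 or "access denied" in body.lower()

# Sample inputs; the markup here is made up for illustration.
print(is_access_denied(403, ""))
print(is_access_denied(200, "<h1>Access Denied</h1>"))
print(is_access_denied(200, "<svg><text>$2,171</text></svg>"))
```

With requests, this could be called as is_access_denied(r.status_code, r.text) right after requests.get(svg_url), before the content is handed to BeautifulSoup.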