
Error in Xpath when scraping last close price of a bond using Selenium

I am unable to pull the last traded price of a bond on finra. I tried using Beautiful Soup which was unable to locate the div tag and I am currently trying to use Selenium to get the same.

I have attached a screenshot of my code and the error message I get when trying to execute my code.

Thanks in advance.

My Code:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import pandas as pd

driver = webdriver.Chrome(My Path Here)
url = 'http://finra-markets.morningstar.com/BondCenter/BondDetail.jsp?ticker=FSBIN4730902&symbol=SBIN4730902'
driver.get(url)
xpath = '//span[@id="price"'
Market_Price = driver.find_element_by_xpath(xpath)
driver.close()
print(Market_Price)


Error message : `selenium.common.exceptions.InvalidSelectorException: Message: invalid selector: Unable to locate an element with the xpath expression //span[@id="price" because of the following error:
SyntaxError: Failed to execute 'evaluate' on 'Document': The string '//span[@id="price"' is not a valid XPath expression.`


The error message is pretty descriptive in this case:

SyntaxError: Failed to execute 'evaluate' on 'Document': The string '//span[@id="price"' is not a valid XPath expression.

It means that your XPath expression is invalid because it is missing the closing ].

It should look like this:

xpath = '//span[@id="price"]'
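Typos like this one can be caught before the expression ever reaches the browser. The sketch below is a hypothetical helper (find_bracket_problem is an illustrative name, not part of Selenium) that only checks that quotes and brackets are balanced. It does not validate XPath in general.

```python
def find_bracket_problem(xpath):
    """Return a description of the first bracket/quote problem, or None if balanced."""
    closers = {']': '[', ')': '('}
    stack = []      # (opening character, position) for brackets still open
    quote = None    # the quote character we are currently inside, if any
    for pos, ch in enumerate(xpath):
        if quote:
            # Inside a string literal: only the matching quote ends it.
            if ch == quote:
                quote = None
        elif ch in ('"', "'"):
            quote = ch
        elif ch in '[(':
            stack.append((ch, pos))
        elif ch in closers:
            if not stack or stack[-1][0] != closers[ch]:
                return f"unexpected {ch!r} at position {pos}"
            stack.pop()
    if quote:
        return f"unterminated {quote} quote"
    if stack:
        ch, pos = stack[-1]
        return f"{ch!r} opened at position {pos} is never closed"
    return None


print(find_bracket_problem('//span[@id="price"'))   # '[' opened at position 6 is never closed
print(find_bracket_problem('//span[@id="price"]'))  # None
```

A check like this is cheap to run before calling the driver, but the browser's own error message remains the authoritative one.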

If you look at the browser's network tab, you can see the callback request that returns the price dynamically. You can issue a simplified version of that request and use a regex to pull the required value out of the returned string. You first need to request the original URL within a Session in order to pick up the srtqs cookie.

import requests, re

params = (
    ('t', 'FSBIN4730902'),
    ('region', 'usa'),
    ('culture', 'en-US'),
    ('productcode', 'QS'),
    ('cur', ''),
    ('refresh', 'true'),
    ('callback', 'jQuery16407409943162048542_1609677517942'),
)

with requests.Session() as s:
    # The first request to the detail page sets the srtqs cookie on the session.
    s.get('http://finra-markets.morningstar.com/BondCenter/BondDetail.jsp?ticker=FSBIN4730902&symbol=SBIN4730902')
    # Pass the query string once, via params, rather than also repeating it in the URL.
    r = s.get('http://quotes.morningstar.com/bondq/quote/c-banner', params=params)
    # Capture the number that follows the last '$' after '"price' on the same line.
    print(re.search(r'"price.*\$([0-9.]+)', r.text).group(1))

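Regex explanation:

"price matches that literal text; .* then matches as many characters as possible on the same line; \$ matches a literal dollar sign; and ([0-9.]+) captures the run of digits and dots that follows it, which is the price.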
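To see how the pattern behaves, here is a small sketch that runs it against two made-up strings. Both fragments are invented for illustration and are not real responses from the quote endpoint; extract_price is likewise a hypothetical helper name.

```python
import re

# Same pattern as in the answer, written as a raw string.
PRICE_PATTERN = re.compile(r'"price.*\$([0-9.]+)')

# Hypothetical fragments, invented for illustration only.
single = '<span id="price">$104.25</span>'
double = '"price": "$1.00", "previous": "$2.50"'


def extract_price(text):
    """Return the captured price string, or None when the pattern does not match."""
    match = PRICE_PATTERN.search(text)
    return match.group(1) if match else None


print(extract_price(single))           # 104.25
print(extract_price(double))           # 2.50 (the greedy .* runs to the last '$')
print(extract_price('no match here'))  # None
```

Because .* is greedy, the pattern picks the last dollar amount on the line that contains "price. If the real response ever carried more than one amount on that line, a non-greedy .*? would select the first one instead.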

You have several issues:

  1. Your XPath expression is incorrect; it is missing the closing ']'.
  2. You should add an implicitly_wait call to allow time for the element to appear.
  3. The element is located inside an <iframe>, so you must switch to that frame before searching for it.
  4. You should call driver.quit rather than driver.close, so that the driver process is terminated in addition to the window being closed.

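from selenium import webdriver
from selenium.webdriver.common.by import By


driver = webdriver.Chrome(My Path Here)

try:
    url = 'http://finra-markets.morningstar.com/BondCenter/BondDetail.jsp?ticker=FSBIN4730902&symbol=SBIN4730902'
    driver.get(url)
    # Let element lookups poll for up to 10 seconds before failing.
    driver.implicitly_wait(10)
    # The price lives inside this iframe, so switch into it first.
    driver.switch_to.frame('ms-bond-detail-iframe')
    xpath = '//*[@id="price"]'
    # find_element(By.XPATH, ...) replaces find_element_by_xpath, which newer Selenium 4 releases no longer provide.
    Market_Price = driver.find_element(By.XPATH, xpath)
    print(Market_Price)
finally:
    # quit() ends the driver process as well as closing the window.
    driver.quit()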

Prints:

<selenium.webdriver.remote.webelement.WebElement (session="357c5c44e981ff0adde57969c24fb638", element="a9c141e2-8f35-4ff0-9970-90d083dd0476")>
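
What gets printed is the WebElement object itself; the visible string is available on its text attribute (Market_Price.text). A minimal sketch of turning that string into a number follows, assuming the text looks like '$104.25'. The exact format on the page is an assumption, and parse_price is a hypothetical helper name.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Extract the first decimal number from a string such as '$104.25'."""
    # Drop thousands separators so '1,004.50' is read as one number.
    match = re.search(r'[0-9]+(?:\.[0-9]+)?', text.replace(',', ''))
    if match is None:
        raise ValueError(f'no number found in {text!r}')
    return Decimal(match.group(0))


# With Selenium this would be called as parse_price(Market_Price.text),
# before driver.quit() invalidates the element.
print(parse_price('$104.25'))  # 104.25
```

Decimal is used rather than float so the quoted price keeps its exact decimal digits.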
