I'm trying to scrape every CRD# from the search results on this site: https://brokercheck.finra.org/search/genericsearch/list
(You'll need to redo the search after opening the link; just type anything into the Individual search box.)
I'm using driver.find_elements_by_xpath to target all the CRD numbers on each results page. I've been experimenting with the paths for a while, but the webdriver still can't pick up the CRDs from the site.
I currently have (in Python):
crds = driver.find_elements_by_xpath("//md-list-item/div/div/div/div/div/bc-bio-geo-section/div/div/div/div/div/span")
But the result is always empty.
Try .find_elements_by_css_selector instead, and match the span by its attribute rather than by its position in the tree:
crds = driver.find_elements_by_css_selector("span[ng-bind-html='vm.item.id']")
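To illustrate why an attribute-based selector tends to hold up better than a long positional path, here is a small offline sketch using only the standard library's xml.etree.ElementTree. The markup below is a made-up, simplified fragment (an assumption, not the real BrokerCheck DOM); it only shows that one extra wrapper div is enough to break a positional path while the attribute match still works.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page's nesting is likely different.
SAMPLE = """
<md-list>
  <md-list-item>
    <div><bc-bio-geo-section><div>
      <span>CRD#</span>
      <span ng-bind-html="vm.item.id">1234567</span>
    </div></bc-bio-geo-section></div>
  </md-list-item>
  <md-list-item>
    <div><div><bc-bio-geo-section><div>
      <span>CRD#</span>
      <span ng-bind-html="vm.item.id">7654321</span>
    </div></bc-bio-geo-section></div></div>
  </md-list-item>
</md-list>
"""

root = ET.fromstring(SAMPLE)

# Positional path: only matches items nested exactly as div/bc-bio-geo-section/div.
positional = root.findall("./md-list-item/div/bc-bio-geo-section/div/span")

# Attribute-based match: finds the span wherever it sits in the tree.
by_attribute = root.findall(".//span[@ng-bind-html='vm.item.id']")

# The positional path also picks up the "CRD#" label span, so filter it out.
positional_ids = [el.text for el in positional if el.text != "CRD#"]
attribute_ids = [el.text for el in by_attribute]

print(positional_ids)
print(attribute_ids)
```

In this sample the positional path misses the second item because of its extra wrapper div, while the attribute match returns both IDs.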
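To print all the CRD# values from the search results at https://brokercheck.finra.org/search/genericsearch/grid using Selenium, you need an explicit wait: the ng-bind-html attributes show that the results are rendered by AngularJS, so the spans are most likely not in the DOM yet when find_elements runs right after the page loads. Use WebDriverWait with visibility_of_all_elements_located() and either of the following locator strategies:
Using CSS_SELECTOR and get_attribute():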
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.ng-binding[ng-bind-html='vm.item.id']")))])
Using XPATH and the text attribute:
print([my_elem.text for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//span[starts-with(., 'CRD')]//following-sibling::span[1]")))])
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
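For readers who want to see what the explicit wait is doing, below is a rough plain-Python sketch of the same polling idea. It is not Selenium's implementation: wait_for_texts, FakeElement and FakeDriver are hypothetical names invented for this example, and the fakes only stand in for a browser so the sketch runs without one. It relies on the driver exposing find_elements(by, value), which is also the call that replaces the find_elements_by_* helpers removed in Selenium 4.3, and on "css selector" being the string value of By.CSS_SELECTOR.

```python
import time

CSS = "css selector"  # string value of selenium's By.CSS_SELECTOR


def wait_for_texts(driver, selector, timeout=10.0, poll=0.5):
    """Poll until the selector matches at least one element and all matches
    are displayed, then return their text. Raises TimeoutError otherwise."""
    deadline = time.monotonic() + timeout
    while True:
        elements = driver.find_elements(CSS, selector)
        if elements and all(el.is_displayed() for el in elements):
            return [el.text for el in elements]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no visible elements for {selector!r} after {timeout}s")
        time.sleep(poll)


class FakeElement:
    """Hypothetical stand-in for a WebElement (text plus is_displayed())."""

    def __init__(self, text):
        self.text = text

    def is_displayed(self):
        return True


class FakeDriver:
    """Hypothetical stand-in for a driver whose page renders late:
    the first `ready_after` lookups return nothing."""

    def __init__(self, ready_after, texts):
        self.ready_after = ready_after
        self.elements = [FakeElement(t) for t in texts]
        self.calls = 0

    def find_elements(self, by, value):
        self.calls += 1
        return [] if self.calls <= self.ready_after else self.elements


demo = FakeDriver(ready_after=2, texts=["1234567", "7654321"])
crds = wait_for_texts(demo, "span[ng-bind-html='vm.item.id']", timeout=2, poll=0.01)
print(crds)
```

With a real driver, the WebDriverWait one-liners above remain the simpler choice; this sketch only makes the retry loop visible.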