I have been trying to solve this issue for a while and tried several solutions posted here before opening this question.
I am currently trying to run a scraper with the following code:
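from pathlib import Path
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
website = 'https://www.abitareco.it/nuove-costruzioni-milano.html'
path = Path().joinpath('util', 'chromedriver')
driver = webdriver.Chrome(path)
driver.get(website)
main = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.NAME, "p1")))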
The hyperlinks I am after have the word scheda in their href:
i = driver.find_element_by_xpath('.//a[contains(@href, "scheda")]')
i.text
My first issue is that find_element_by_xpath only returns a single hyperlink, and my second issue is that it is not extracting anything so far.
I'd appreciate any help and/or guidance.
You need to use find_elements instead:
for name in driver.find_elements(By.XPATH, ".//a[contains(@href, 'scheda')]"):
    print(name.text)
Note that find_elements returns a list of web elements, whereas find_element returns a single web element.
If you are specifically looking for the href attribute, you can try the code below:
for name in driver.find_elements(By.XPATH, ".//a[contains(@href, 'scheda')]"):
    print(name.get_attribute('href'))
There are 2 issues, looking at the website.
Assuming what you want is to scrape the URLs of all these links, you can use .get_attribute('href') instead of .text, like so:
url_list = driver.find_elements(By.XPATH, './/a[contains(@href, "scheda")]')
for i in url_list:
    print(i.get_attribute('href'))
It will detect all web elements that match your criteria and store them in a list. I just used print as an example, but obviously you may want to do more than just print the links.
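As one possible next step beyond printing, here is a minimal sketch that collects the href values into a clean list. The helper name unique_hrefs and the example.com URLs are made up for illustration and are not part of Selenium or of the site in question. It assumes that the same link may appear more than once on the page and that get_attribute('href') can return None for some elements.

```python
from typing import Iterable, List, Optional


def unique_hrefs(hrefs: Iterable[Optional[str]]) -> List[str]:
    """Drop missing values and duplicates while keeping first-seen order.

    Hypothetical helper, not part of Selenium.
    """
    seen = set()
    result = []
    for href in hrefs:
        # Skip None/empty values and anything already collected.
        if href and href not in seen:
            seen.add(href)
            result.append(href)
    return result


# Placeholder data standing in for the values read from the page.
sample = [
    "https://example.com/scheda-1.html",
    None,
    "https://example.com/scheda-1.html",
    "https://example.com/scheda-2.html",
]
print(unique_hrefs(sample))
```

With the url_list from the snippet above, it could be called as links = unique_hrefs(el.get_attribute('href') for el in url_list), after which links can be written to a file or visited one by one.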