Click and scrape 'a href' links by class name using Selenium in Python

Question

I have the following 'a href' link, which has only a class to identify it. I'm trying to get Selenium to click through each link in turn, but it isn't returning the correct page source for each of the linked pages.

<div class="row">
 <div class="item">
  <a href="/path/to/link/" class="link-box">
 <div class="item">
 <div class="item">
 <div class="item">

What am I doing wrong here:

driver = webdriver.Chrome("/Users/me/Downloads/chromedriver", options=options)
driver.get("https://the_website")
link_box = driver.find_elements_by_class_name('link-box')

for i in range(len(link_box)):
  driver.execute_script("arguments[0].click();", link_box[i])
page_source = driver.page_source
pprint(page_source)

Answer 1

I wrote a different script to do this.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
#driver = webdriver.Chrome(executable_path='chromedriver.exe')
driver = webdriver.Firefox(executable_path='geckodriver')
driver.get("url")
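links = []
# Collect every href up front, before the driver navigates anywhere.
for a in driver.find_elements_by_class_name('link-box'):
    links.append(a.get_attribute('href'))
print(links)

# Open each collected link in its own new tab.
for b, link in enumerate(links):
    driver.execute_script("window.open('');")
    # Tab 0 is the original page, so the tab just opened is at index b + 1.
    driver.switch_to.window(driver.window_handles[b + 1])
    driver.get(link)
    print(link)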

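First, it collects the href of every element with the class link-box. Then it opens each of those links in a new tab; clicking through the elements in the original tab instead might raise an error. I did this with Firefox, but if you are using Chrome, comment out the webdriver.Firefox line (line 4), uncomment the webdriver.Chrome line (line 3), and set the correct path to your chromedriver.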
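
If you also want the page source of each linked page (which is what the question asks for), the same idea can be sketched without opening new tabs. This is only a minimal sketch, not the code from the answer above: the helper name collect_page_sources is hypothetical, and it assumes a Selenium 4 style driver, where find_elements takes a locator strategy and a value ("class name" is the string behind By.CLASS_NAME).

```python
def collect_page_sources(driver, class_name="link-box"):
    """Return a dict mapping each link's href to the page source behind it.

    `driver` is assumed to be a Selenium WebDriver that is already on the
    page containing the links. The function name is hypothetical.
    """
    # "class name" is the string value of selenium's By.CLASS_NAME.
    elements = driver.find_elements("class name", class_name)

    # Read all hrefs before navigating: the element references belong to
    # the starting page and should not be reused after driver.get().
    hrefs = [el.get_attribute("href") for el in elements]

    sources = {}
    for href in hrefs:
        if not href:
            continue  # skip anchors that have no href
        driver.get(href)
        sources[href] = driver.page_source
    return sources


# Usage with a real browser (assumes Selenium 4 and a working driver setup):
#
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   driver.get("https://the_website")
#   pages = collect_page_sources(driver)
#   for url, html in pages.items():
#       print(url, len(html))
#   driver.quit()
```

Because the hrefs are plain strings, it does not matter that the driver leaves the first page; only the strings are used inside the loop.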
