Scraping a news aggregator website by clicking the "more news" button using Selenium

Question

I want to scrape news headlines from this link: https://www.newsnow.co.uk/h/Business+&+Finance?type=ln

I want to expand the list by clicking the 'view more headlines' button with Selenium, so that I can collect as many headlines as possible.

I wrote this code, but the click that should expand the list fails:

import time
from selenium import webdriver
u = 'https://www.newsnow.co.uk/h/Business+&+Finance?type=ln'

driver = webdriver.Chrome(executable_path=r"C:\chromedriver.exe")
driver.get(u)
driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")    
driver.implicitly_wait(60) # seconds

elem = driver.find_element_by_css_selector('span:contains("view more headlines")')
for i in range(10):
    elem.click()
    time.sleep(5)
    print(f'click {i} done')

returns: selenium.common.exceptions.InvalidSelectorException: Message: invalid selector: An invalid or illegal selector was specified

I tried using xpath selector:

elem = driver.find_element_by_xpath('//[@id="nn_container"]/div[2]/main/div[2]/div/div/div[3]/div/a')

returns: selenium.common.exceptions.ElementClickInterceptedException: Message: element click intercepted: Element <a class="rs-button-more js-button-more btn--primary btn--primary--no-spacing" href="#">...</a> is not clickable at point (353, 551). Other element would receive the click: <div class="alerts-scroller">...</div>

Answer 1

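After the first click, the 'view more headlines' button is covered by an overlay element (the alerts-scroller div named in your error), so the next click is intercepted. To get around this, we use JavaScript to scroll the button into view before each click. Here is the working program: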

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

u = 'https://www.newsnow.co.uk/h/Business+&+Finance?type=ln'

driver = webdriver.Chrome(executable_path=r"C:\bin\chromedriver.exe")
driver.maximize_window()
driver.get(u)
time.sleep(10)  # give the page time to finish loading
driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
for i in range(10):
    # wait until the button label is clickable, then bring it into view
    element = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CLASS_NAME, 'btn--primary__label')))
    driver.execute_script("arguments[0].scrollIntoView();", element)
    element.click()
    time.sleep(5)  # let the newly loaded headlines render
    print(f'click {i} done')
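
If the native click is still intercepted on some screen sizes, one common variation is to trigger the click from JavaScript as well, and to stop as soon as the button is no longer found. The sketch below is only an illustration of that idea, not part of the program above: click_until_gone and find_button are hypothetical names, and the 5 second pause is an assumption carried over from the loop above. It relies only on the driver's execute_script method.

```python
import time


def click_until_gone(driver, find_button, max_clicks=10, pause=5.0):
    """Click a 'load more' button via JavaScript until it disappears.

    driver      -- any object with Selenium's execute_script(script, *args)
    find_button -- hypothetical callable: takes the driver and returns the
                   button element, or None when the button is no longer there
    Returns the number of clicks performed.
    """
    clicks = 0
    for _ in range(max_clicks):
        button = find_button(driver)
        if button is None:
            break  # nothing left to expand
        # Centre the button so fixed overlays are less likely to cover it.
        driver.execute_script(
            "arguments[0].scrollIntoView({block: 'center'});", button
        )
        # A JavaScript click is dispatched on the element itself, so an
        # overlapping element cannot intercept it the way it can a native click.
        driver.execute_script("arguments[0].click();", button)
        clicks += 1
        time.sleep(pause)  # assumed delay for new headlines to load
    return clicks
```

With Selenium, find_button could wrap the WebDriverWait(...).until(...) call from the program above and return None when it raises TimeoutException. Keeping the lookup in a separate callable means the loop itself does not depend on any particular locator.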

Answer 2

The XPath in the question is missing the * after the leading //. This is the correct XPath:

driver.find_element_by_xpath(r'//*[@id="nn_container"]/div[2]/main/div[2]/div/div/div[3]/div/a').click()
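
As a side note on the first error in the question: :contains() is a jQuery extension, not standard CSS, which is why Selenium rejects it with InvalidSelectorException. XPath does support text matching through contains(), and a text-based XPath is usually less brittle than an absolute path like the one above. The helper below is a minimal sketch of that approach; xpath_for_text is a hypothetical name, and the exact tag and casing of the button text on the page are assumptions taken from the question's attempt.

```python
def xpath_for_text(tag, text):
    """Build an XPath matching <tag> elements whose text contains `text`.

    normalize-space(.) trims and collapses whitespace, so stray line breaks
    inside the element do not prevent a match. The match is case-sensitive.
    """
    if '"' in text:
        # Keep the sketch simple: quoting rules for embedded double quotes
        # in XPath string literals are out of scope here.
        raise ValueError("text must not contain double quotes")
    return f'//{tag}[contains(normalize-space(.), "{text}")]'


# Hypothetical use with the Selenium API shown above:
#   driver.find_element_by_xpath(xpath_for_text("span", "view more headlines"))
more_button_xpath = xpath_for_text("span", "view more headlines")
```

Finding the element this way does not by itself prevent the click from being intercepted by an overlay; it only replaces the locator.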

2 answers

solution1
1 ACCEPTED 2020-11-28 08:14:08

solution2
0 2020-11-28 08:18:28