
Unable to Open .htm link with Selenium

I cannot open the link shown in the screenshot with Selenium. I have tried finding the element by CSS selector, link text, partial link text, and XPath, with no success: the program shows no error, but it does not click the last link. The screenshot is of the inspected markup on the SEC website. The line that should open the link is the one under the `##Cannot open file` comment in the code below.

from bs4 import BeautifulSoup as soup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

PATH = r"C:\Program Files (x86)\Misc Programs\chromedriver.exe"  # raw string so the backslashes are taken literally

stock = 'KO'
#stock = input("Enter stock ticker: ")
browser = webdriver.Chrome(PATH)

#First SEC search
sec_url = 'https://www.sec.gov/search/search.htm'
browser.get(sec_url)
tikSearch = browser.find_element_by_css_selector('#cik')
tikSearch.click()
tikSearch.send_keys(stock)


Sclick = browser.find_element_by_css_selector('#searchFormDiv > form > fieldset > span > input[type=submit]')
Sclick.click()

formDesc = browser.find_element_by_css_selector('#seriesDiv > table > tbody > tr:nth-child(2) > td:nth-child(1)')
print(formDesc)

doc = browser.find_element_by_css_selector('#documentsbutton')
doc.click()

##Cannot open file: this is the click that silently does nothing
form = browser.find_element_by_xpath('//*[@id="formDiv"]/div/table/tbody/tr[2]/td[3]/a')
form.click()


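# uReq is not defined in the post; it is presumably the usual alias for urlopen
from urllib.request import urlopen as uReq
uClient = uReq(sec_url)
page_html = uClient.read()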

 

In Firefox this worked and landed on https://www.sec.gov/Archives/edgar/data/21344/000002134421000018/a20201231crithrifplan.htm

Pasting that URL into Chrome directly also works.

But in the script the link indeed did not open, and the browser stayed on https://www.sec.gov/Archives/edgar/data/21344/000002134421000018/0000021344-21-000018-index.htm. Oddly, clicking the link by hand works in the browser that Selenium launched.

A proper wait is the better fix, but as a quick check, if I put time.sleep(5) before your

form = browser.find_element_by_xpath('//*[@id="formDiv"]/div/table/tbody/tr[2]/td[3]/a')

it opens in Chrome.

EDIT: And here it is done properly with no sleep:

wait = WebDriverWait(browser, 20)
wait.until(EC.presence_of_element_located((By.XPATH, '//*[@id="formDiv"]/div/table/tbody/tr[2]/td[3]/a'))).click()

This assumes you have the imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
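
To show why the explicit wait succeeds where the immediate lookup does not, here is a minimal pure-Python sketch of the polling idea behind `WebDriverWait(...).until(...)`. It is not Selenium's implementation: `wait_until`, `WaitTimeout`, and `find_link` are hypothetical names made up for illustration, and the real class also ignores `NoSuchElementException` between polls, which this sketch leaves out.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=20.0, poll=0.5):
    """Call condition() repeatedly until it returns something truthy."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll)  # pause before the next lookup


# Stand-in for a page whose link only shows up on the third lookup.
lookups = {"count": 0}


def find_link():
    lookups["count"] += 1
    return "link" if lookups["count"] >= 3 else None


found = wait_until(find_link, timeout=2.0, poll=0.01)
print(found, "after", lookups["count"], "lookups")
```

A single `find_element` call is like calling `find_link()` once: if the element is not there yet, the lookup simply fails. The wait keeps asking until the element appears or the timeout runs out.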

Possibly useful addition: I am surprised there is no Selenium test helper out there with methods that wrap in some bulletproofing (or maybe there is one and I do not know of it), like what Hetzner Cloud did in its Protractor Test Helper. So I wrote my own little wrapper for click (and one for send keys, which calls this one). If it's useful to you or other readers, enjoy. It could be enhanced to build in retries, or to take as parameters the wait time and whether to scroll the element to the top or bottom of the window (or at all). It works in my context as is.

def safe_click(driver, locate_method, locate_string):
    """
    Wait for an element, scroll it into view, then click it.

    Parameters
    ----------
    driver : webdriver
        initialized browser object
    locate_method : Locator
        By.something
    locate_string : string
        how to find it

    Returns
    -------
    None
        returns whatever click() does (WebElement.click() returns None).

    """
    wait = WebDriverWait(driver, 15)
    # First make sure the element exists in the DOM at all
    wait.until(EC.presence_of_element_located((locate_method, locate_string)))
    # Scroll it to the bottom of the visible window
    driver.execute_script("arguments[0].scrollIntoView(false);",
                          driver.find_element(locate_method, locate_string))
    # Then wait until it is visible and enabled before clicking
    return wait.until(EC.element_to_be_clickable((locate_method, locate_string))).click()

If you use it, then the call (which I just tested and it worked) would be:

safe_click(browser, By.XPATH, '//*[@id="formDiv"]/div/table/tbody/tr[2]/td[3]/a')

You could use it for the other clicks in your script too, but they do not seem to need it.
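
The retry enhancement mentioned earlier could be sketched as a generic wrapper around any click function. This is only an illustration under assumptions: `with_retries` and `flaky_click` are hypothetical names, and the flaky function stands in for a real click so the example runs without a browser.

```python
import time


def with_retries(action, attempts=3, delay=1.0, retry_on=(Exception,)):
    """Run action(); on one of the listed exceptions, pause and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the last error
            time.sleep(delay)


# Stand-in for a click that fails twice before succeeding.
calls = []


def flaky_click():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("element not ready")
    return "clicked"


outcome = with_retries(flaky_click, attempts=3, delay=0.01,
                       retry_on=(RuntimeError,))
print(outcome, "after", len(calls), "attempts")
```

With Selenium, the action would be something like `lambda: safe_click(browser, By.XPATH, '//*[@id="formDiv"]/div/table/tbody/tr[2]/td[3]/a')`, and `retry_on` could be `(TimeoutException,)` imported from `selenium.common.exceptions`.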
