Using selenium webdriver, how to click on multiple random links in webpage one after another continuously to detect broken links?

I'm trying to write a test script that, after logging in, tests the visible links on a webpage at random rather than having each one specified explicitly. Is this possible in Selenium IDE/WebDriver, and if so, how can I do it?

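from random import randint
from selenium.webdriver.common.by import By
links = driver.find_elements(By.TAG_NAME, "a")  # plural: returns a list of all <a> elements
link = links[randint(0, len(links) - 1)]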

The above fetches all links on the first page and picks one at random, but how do I test all of them, or as many as possible, without manually repeating this code for each link or page? What I'm really trying to do is find broken links that return a 500 or 404. Is there a productive way of doing this? Thanks.

Currently, Selenium does not give you a legitimate way to read the HTTP status code. You could use Selenium to crawl for URLs and another library, such as requests, to check each link's status, like this (or use the title check proposed by @MrTi):

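import requests
from selenium.webdriver.common.by import By

def find_broken_links(root, driver):
    visited = set()
    broken = set()
    # A list used as a stack gives DFS; use collections.deque with popleft() for BFS.
    elements = [root]
    session = requests.Session()

    while elements:
        el = elements.pop()
        if el in visited:
            continue

        visited.add(el)

        # Check the status with requests, since Selenium does not expose it.
        resp = session.get(el)
        if resp.status_code in [500, 404]:
            broken.add(el)
            continue

        driver.get(el)
        # find_elements (plural) returns a list of every <a> on the page.
        links = driver.find_elements(By.TAG_NAME, "a")
        for link in links:
            href = link.get_attribute('href')
            # Skip anchors without an href and non-HTTP schemes such as mailto:.
            if href and href.startswith('http'):
                elements.append(href)

    return broken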
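
As written, the crawler above follows every link it finds, including links to other sites, and its requests session does not share the browser's login. A minimal sketch of helpers that could address both follows. The names `normalize`, `same_site` and `copy_cookies` are hypothetical and not part of Selenium or requests, and the sketch assumes the site keeps its login state in cookies.

```python
from urllib.parse import urldefrag, urlparse


def normalize(href):
    """Drop the #fragment so the same page is not queued twice."""
    return urldefrag(href).url


def same_site(href, root):
    """Return True if href is an http(s) URL on the same host as root."""
    target, base = urlparse(href), urlparse(root)
    return target.scheme in ("http", "https") and target.netloc == base.netloc


def copy_cookies(driver, session):
    """Copy the browser's cookies into a requests session.

    Assumption: the login is cookie-based, so the copied cookies are
    enough for requests to reach pages behind the login.
    """
    for cookie in driver.get_cookies():
        session.cookies.set(
            cookie["name"], cookie["value"], domain=cookie.get("domain", "")
        )
```

One way to use these would be to call `copy_cookies(driver, session)` once after logging in, and to append `normalize(href)` to `elements` only when `same_site(href, root)` is true.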

When testing for a bad page, I usually test for the title/url. If you are testing a self-contained site, then you should find/create a link that is bad, and see what is unique in the title/URL, and then do something like:

assert(!driver.getTitle().contains("500 Error"));

If you don't know what the title/url will look like, you can check if the title contains "500"/"404"/"Error"/"Page not found" or if the page source contains those as well.
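
A rough Python sketch of that heuristic is below. `looks_broken` and `ERROR_MARKERS` are hypothetical names, and the marker list is just the strings mentioned above, so adjust it to whatever your site's error pages actually show.

```python
# Strings that suggest an error page; an assumption, tune them per site.
ERROR_MARKERS = ("500", "404", "Error", "Page not found")


def looks_broken(title, page_source="", markers=ERROR_MARKERS):
    """Return the markers found in the title, else those found in the page source.

    An empty list means nothing suspicious was found. Matching is
    case-insensitive, and the page source is only consulted when the
    title is clean, because the source produces more false positives.
    """
    title_hits = [m for m in markers if m.lower() in title.lower()]
    if title_hits:
        return title_hits
    return [m for m in markers if m.lower() in page_source.lower()]
```

With a live driver this could be called as `looks_broken(driver.title, driver.page_source)` after each `driver.get(...)`.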

This will probably flag a bunch of pages that aren't really bad (especially if you check the page source), so you will need to go through each of them and verify that they really are broken.
