How to find broken links using Selenium WebDriver with Java

I want to verify broken links on a website, and I am using this code:

    public static int invalidLink;
    String currentLink;
    String temp;

    public static void main(String[] args) throws IOException {
        // Launch The Browser
        WebDriver driver = new FirefoxDriver();
        // Enter URL
        driver.get("http://www.applicoinc.com");

        // Get all the links URL
        List<WebElement> ele = driver.findElements(By.tagName("a"));
        System.out.println("size:" + ele.size());
        boolean isValid = false;
        for (int i = 0; i < ele.size(); i++) {

            isValid = getResponseCode(ele.get(i).getAttribute("href"));
            if (isValid) {
                System.out.println("ValidLinks:" + ele.get(i).getAttribute("href"));
                driver.get(ele.get(i).getAttribute("href"));
                List<WebElement> ele1 = driver.findElements(By.tagName("a"));
                System.out.println("InsideSize:" + ele1.size());
                for (int j=0; j<ele1.size(); j++){
                    isValid = getResponseCode(ele.get(j).getAttribute("href"));
                    if (isValid) {
                        System.out.println("ValidLinks:" + ele.get(j).getAttribute("href"));
                    }
                    else{
                        System.out.println("InvalidLinks:"+ ele.get(j).getAttribute("href"));
                    }
                }

            } else {
                System.out.println("InvalidLinks:"
                        + ele.get(i).getAttribute("href"));
            }

        }
    }


    public static boolean getResponseCode(String urlString) {
        boolean isValid = false;
        try {
            URL u = new URL(urlString);
            HttpURLConnection h = (HttpURLConnection) u.openConnection();
            h.setRequestMethod("GET");
            h.connect();
            System.out.println(h.getResponseCode());
            if (h.getResponseCode() != 404) {
                isValid = true;
            }
        } catch (Exception e) {

        }
        return isValid;
    }

}

I would keep it returning an int and treat MalformedURLException as a special case, returning -1.

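// Declares IOException because openConnection(), connect() and getResponseCode() can all throw it.
public static int getResponseCode(String urlString) throws IOException {
    try {
        URL u = new URL(urlString);
        HttpURLConnection h = (HttpURLConnection) u.openConnection();
        h.setRequestMethod("GET");
        h.connect();
        return h.getResponseCode();
    } catch (MalformedURLException e) {
        // Special case: the href is not a well-formed URL.
        return -1;
    }
}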
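With an int return value, the caller decides what counts as broken. Here is a minimal sketch of that decision, assuming that the -1 sentinel and any 4xx/5xx status should be reported as broken. The class and method names are hypothetical and not part of the code above.

```java
class LinkVerdict {
    // Sentinel returned by getResponseCode for a malformed URL.
    static final int MALFORMED = -1;

    // Assumption: anything below 400 (2xx success, 3xx redirect) counts as a working link.
    static boolean isBroken(int responseCode) {
        return responseCode == MALFORMED || responseCode >= 400;
    }

    public static void main(String[] args) {
        int[] samples = {200, 301, 404, 500, MALFORMED};
        for (int code : samples) {
            System.out.println(code + " -> " + (isBroken(code) ? "broken" : "ok"));
        }
    }
}
```

Keeping the raw code around also lets you tell a 404 apart from a 500 in the report, which the boolean version cannot do.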

EDIT: It seems you're sticking with the boolean approach. As I said before, this has its limitations, but it should work fine for demonstration purposes.

There is no reason to find all the elements a second time with the approach you have. Try this:

// Get all the links
List<WebElement> ele = driver.findElements(By.tagName("a"));
System.out.println("size:" + ele.size());
boolean isValid = false;
for (int i = 0; i < ele.size(); i++) {
    String nextHref = ele.get(i).getAttribute("href");
    isValid = getResponseCode(nextHref);
    if (isValid) {
        System.out.println("Valid Link:" + nextHref);

    }
    else {
        System.out.println("INVALID Link:" + nextHref);

    }
}

This is untested code, so if it does not work, please provide more detail than just saying 'it doesn't work': include the output and any stack traces or error messages if possible. Cheers

It seems that some of your href attributes contain expressions that are not recognized as URLs. What comes immediately to mind is to use a try/catch block to identify such URLs. Try the following piece of code.

package com.automation.test;

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class Test {
    public static int invalidLink;
    String currentLink;
    String temp;

    public static void main(String[] args) throws IOException {
        // Launch The Browser
        WebDriver driver = new FirefoxDriver();
        // Enter Url
        driver.get("file:///home/sighil/Desktop/file");

        // Get all the links url
        List<WebElement> ele = driver.findElements(By.tagName("a"));
        System.out.println("size:" + ele.size());
        boolean isValid = false;
        for (int i = 0; i < ele.size(); i++) {
            // System.out.println(ele.get(i).getAttribute("href"));
            isValid = getResponseCode(ele.get(i).getAttribute("href"));
            if (isValid) {
                System.out.println("ValidLinks:"
                        + ele.get(i).getAttribute("href"));
            } else {
                System.out.println("InvalidLinks:"
                        + ele.get(i).getAttribute("href"));
            }
        }

    }

    public static boolean getResponseCode(String urlString) {
        boolean isValid = false;
        try {
            URL u = new URL(urlString);
            HttpURLConnection h = (HttpURLConnection) u.openConnection();
            h.setRequestMethod("GET");
            h.connect();
            System.out.println(h.getResponseCode());
            if (h.getResponseCode() != 404) {
                isValid = true;
            }
        } catch (Exception e) {
            // A malformed or unreachable URL ends up here; isValid stays false.
        }
        return isValid;
    }

}

I have modified getResponseCode to return a boolean based on whether the URL is valid (true) or invalid (false).
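As an alternative to relying on the catch block, hrefs that cannot be checked over HTTP can be filtered out before any request is made. This is a rough sketch, assuming that only http and https links are worth checking and that everything else (mailto:, javascript:, a missing href) should be skipped rather than reported as invalid. The HrefFilter class is hypothetical.

```java
import java.util.Locale;

class HrefFilter {
    // Returns true only for hrefs that an HTTP status check can meaningfully test.
    static boolean isCheckable(String href) {
        if (href == null || href.trim().isEmpty()) {
            return false; // <a> without an href, or an empty one
        }
        String lower = href.trim().toLowerCase(Locale.ROOT);
        return lower.startsWith("http://") || lower.startsWith("https://");
    }

    public static void main(String[] args) {
        String[] samples = {
            "https://example.com/a", "mailto:someone@example.com", "javascript:void(0)", null
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isCheckable(s));
        }
    }
}
```

In the loop above you would call isCheckable on the href first and only pass it to getResponseCode when it returns true.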

Hope this helps you.

You can try the code below.

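public static void main(String[] args) {
    WebDriver driver = new FirefoxDriver();

    List<String> brokenLinks = getBrokenURLs(driver, "http://mayurshah.in", 2, new ArrayList<String>());
    for (String brokenLink : brokenLinks) {
        System.out.println(brokenLink);
    }
    driver.quit();
}

// Collects links starting from appURL down to the given depth, then checks each one.
public static List<String> getBrokenURLs(WebDriver driver, String appURL, int depth, List<String> links) {
    collectLinks(driver, appURL, depth, links);
    return getBrokenURLs(driver, links, new ArrayList<String>());
}

// Helper: stores every new href found on appURL, then recurses into those pages.
private static void collectLinks(WebDriver driver, String appURL, int depth, List<String> links) {
    if (depth <= 0) {
        return;
    }
    driver.navigate().to(appURL);
    System.out.println("Depth is: " + depth);
    // Read the hrefs into strings first: the WebElements go stale once we navigate away.
    List<String> found = new ArrayList<String>();
    for (WebElement linkElement : driver.findElements(By.tagName("a"))) {
        String href = linkElement.getAttribute("href");
        if (href != null && href.startsWith("http") && !links.contains(href)) {
            links.add(href);
            found.add(href);
        }
    }
    for (String link : found) {
        collectLinks(driver, link, depth - 1, links);
    }
}

// Visits each collected link and records those whose page title shows a 404.
public static List<String> getBrokenURLs(WebDriver driver, List<String> links, List<String> brokenLinks) {
    for (String link : links) {
        driver.navigate().to(link);
        if (driver.getTitle().contains("404 Page Not Found")) {
            brokenLinks.add(link);
        }
    }
    return brokenLinks;
}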

In the code above, I first get the list of URLs from the first page. I then navigate to the first link, collect all the URLs on that second page, and keep storing URLs by visiting each page one by one until the specified depth is reached.

After collecting all the URLs, I verify each one in turn and return the list of URLs that show a 404 page.
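The collection step described here is a depth-limited traversal, and it can be sketched without a browser to make the bookkeeping clear. This is only an illustration under assumptions: LinkCollector and its linksOn parameter are hypothetical, and linksOn stands in for "load the page and read every href", which in a real run would be done with the WebDriver.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class LinkCollector {
    // Breadth-first collection of the URLs reachable from start within maxDepth hops.
    static Set<String> collect(String start, int maxDepth, Function<String, List<String>> linksOn) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> current = new ArrayDeque<>();
        seen.add(start);
        current.add(start);
        for (int depth = 0; depth < maxDepth && !current.isEmpty(); depth++) {
            Deque<String> next = new ArrayDeque<>();
            for (String page : current) {
                for (String href : linksOn.apply(page)) {
                    // add() returns false for a URL already seen, so each page is expanded once.
                    if (href != null && seen.add(href)) {
                        next.add(href);
                    }
                }
            }
            current = next;
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical in-memory "site" used instead of a real browser.
        Map<String, List<String>> site = new HashMap<>();
        site.put("/", List.of("/a", "/b"));
        site.put("/a", List.of("/", "/c"));
        site.put("/c", List.of("/d"));
        System.out.println(collect("/", 2, p -> site.getOrDefault(p, List.of())));
    }
}
```

The seen set is what keeps pages that link back to each other from being visited repeatedly, and a depth of 2 here stops before "/d" is ever reached.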

Hope that helps!

src: https://softwaretestingboard.com/qna/1380/how-to-find-broken-links-images-from-page-using-webdriver#axzz4wM3UEZtq

In a web application we have to verify whether any of the links are broken, meaning that clicking the link displays a 'page not found' page. Below is the code:

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver; 
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class VerifyLinks {
    public static void main(String[] args) {
        WebDriver driver = new FirefoxDriver();
        driver.manage().window().maximize();
        driver.get("https://www.google.co.in");
        List<WebElement> allLink = driver.findElements(By.tagName("a"));
        System.out.println("Total links are " + allLink.size());
        for (int i = 0; i < allLink.size(); i++) {
            WebElement ele = allLink.get(i);
            String url = ele.getAttribute("href");
            verifyLinkActive(url);
        }
    }

    public static void verifyLinkActive(String linkurl) {
        try {
            URL url = new URL(linkurl);
            HttpURLConnection httpUrlConnect = (HttpURLConnection) url.openConnection();
            httpUrlConnect.setConnectTimeout(3000);
            httpUrlConnect.connect();
            if (httpUrlConnect.getResponseCode() == 200) {
                System.out.println(linkurl + " - " + httpUrlConnect.getResponseMessage());
            }
            if (httpUrlConnect.getResponseCode() == HttpURLConnection.HTTP_NOT_FOUND) {
                System.out.println(linkurl + " - " + httpUrlConnect.getResponseMessage()
                        + " - " + HttpURLConnection.HTTP_NOT_FOUND);
            }
        } catch (Exception e) {
        }
    }
}

For more tutorials, visit https://www.jbktutorials.com/selenium
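A variation on verifyLinkActive that returns the status instead of printing it might look like the sketch below. It is not the code from the tutorial: the LinkStatus class is hypothetical, and it assumes a HEAD request is enough to tell whether a link works and that -1 is an acceptable marker for "no HTTP status available".

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

class LinkStatus {
    // Returns the HTTP status code, or -1 if the URL is malformed, is not HTTP(S), or cannot be reached.
    static int statusOf(String link) {
        HttpURLConnection http = null;
        try {
            URLConnection conn = new URL(link).openConnection();
            if (!(conn instanceof HttpURLConnection)) {
                return -1; // for example a file: URL
            }
            http = (HttpURLConnection) conn;
            http.setRequestMethod("HEAD");  // ask for headers only, no response body
            http.setConnectTimeout(3000);   // same 3-second connect timeout as verifyLinkActive
            http.setReadTimeout(3000);      // assumption: also bound the wait for the response
            return http.getResponseCode();
        } catch (IOException e) {
            return -1; // MalformedURLException is an IOException as well
        } finally {
            if (http != null) {
                http.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(statusOf("not a url"));
        System.out.println(statusOf("file:///tmp/example.html"));
    }
}
```

Some servers answer HEAD differently from GET (for example with 405), so a link that looks broken here may be worth retrying with GET before it is reported.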

Steps:
1. Open a browser and navigate to TestURL
2. Grab all the links from the entire page
3. Check the HTTP status code of every link grabbed in step 2 (200 is OK; codes of 400 and above mean the link is broken)

Selenium WebDriver Java code:

WebDriver driver = new FirefoxDriver();
driver.get("<TestURL>");
List<WebElement> total_links = driver.findElements(By.tagName("a"));
System.out.println("Total Number of links: " + total_links.size());
for (int i = 0; i < total_links.size(); i++) {
    String url = total_links.get(i).getAttribute("href");
    int resp_Code = 0;
    try {
        HttpResponse urlresp = new DefaultHttpClient().execute(new HttpGet(url));
        resp_Code = urlresp.getStatusLine().getStatusCode();
    } catch (Exception e) {
        // Request failed (null href, bad URL, network error): resp_Code stays 0.
    }
    // 0 means no response was received, so report that as broken too.
    if (resp_Code == 0 || resp_Code >= 400) {
        System.out.println(url + " is a broken link");
    } else {
        System.out.println(url + " is a valid link");
    }
}

// allHref: the links that pass the filter below and were actually checked
List<WebElement> allHref = new ArrayList<WebElement>();
List<WebElement> linklist = driver.findElements(By.tagName("a"));

for (int i = 0; i < linklist.size(); i++) {
    String href = linklist.get(i).getAttribute("href");
    // Check for null before calling contains(); an <a> without href would otherwise throw NullPointerException.
    if (href != null && href.contains("https:")) {
        System.out.println(href);

        HttpURLConnection connection = (HttpURLConnection) new URL(href).openConnection();
        connection.connect();
        String response = connection.getResponseMessage();
        connection.disconnect();
        System.out.println(href + "R=e=s=p=o=n=s=e=>" + response);
        allHref.add(linklist.get(i));
    }
}
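A loop like this sends one request per anchor element, so a URL that appears several times on the page is checked several times. One way to avoid that, sketched here under the assumption that links differing only in their "#fragment" point at the same resource, is to reduce the hrefs to a unique list first. UniqueLinks is a hypothetical helper, not part of the code above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class UniqueLinks {
    // Drops nulls, strips "#fragment" suffixes, and removes duplicates while keeping page order.
    static List<String> unique(List<String> hrefs) {
        Set<String> seen = new LinkedHashSet<>();
        for (String href : hrefs) {
            if (href == null) {
                continue;
            }
            int hash = href.indexOf('#');
            String withoutFragment = hash >= 0 ? href.substring(0, hash) : href;
            if (!withoutFragment.isEmpty()) {
                seen.add(withoutFragment);
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList(
            "https://example.com/a", "https://example.com/a#top", null,
            "https://example.com/b", "https://example.com/a");
        System.out.println(unique(hrefs));
    }
}
```

The hrefs would be read from the WebElements once, passed through unique, and only then checked one by one.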
