Exception in thread "main" java.net.MalformedURLException: no protocol error while finding broken links in a page using Selenium and Java
I am trying to find the broken links in a page with Selenium (Java), but I cannot run the code because of the exception below. The code first finds the total number of links in the page and then reads the URL of each link. Please take a look at the issue and suggest a resolution.
Exception in thread "main" java.net.MalformedURLException: no protocol:
at java.net.URL.<init>(Unknown Source)
at java.net.URL.<init>(Unknown Source)
at java.net.URL.<init>(Unknown Source)
at fire.Weil.main(Weil.java:57)
My code is:
package fire;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
public class Weil {
public static void main(String[] args) throws MalformedURLException, IOException{
System.setProperty("webdriver.gecko.driver", "C:\\Users\\sumitk\\Downloads\\Selenium Drivers\\Gecodriver\\geckodriver.exe");
WebDriver driver = new FirefoxDriver();
//delete all cookies
driver.manage().deleteAllCookies();
//dynamic wait
driver.manage().timeouts().pageLoadTimeout(30, TimeUnit.SECONDS);
driver.manage().timeouts().implicitlyWait(5, TimeUnit.SECONDS);
//open site
driver.get("https://www.weil.com/");
//1. get the list of all the links and images
List<WebElement> linklist = driver.findElements(By.tagName("a"));
linklist.addAll(driver.findElements(By.tagName("img")));
System.out.println("Size of full links and images--->"+ linklist.size());
List<WebElement> activeLinks =new ArrayList<WebElement>();
// 2. iterate linklist : exclude all the links/images does not have any href attribute
for(int i=0; i<linklist.size(); i++)
{
System.out.println(linklist.get(i).getAttribute("href"));
if(linklist.get(i).getAttribute("href") !=null)
{
activeLinks.add(linklist.get(i));
}
}
//get the size of active links list.
System.out.println("Size of active links and images--->"+ activeLinks.size());
//3. check the href url, with httpconnection api.
for(int j=0; j<activeLinks.size(); j++)
{
HttpURLConnection connection=(HttpURLConnection) new URL(activeLinks.get(j).getAttribute("href")).openConnection();
connection.connect();
String response=connection.getResponseMessage();
connection.disconnect();
System.out.println(activeLinks.get(j).getAttribute("href") +" --->"+response);
}
}
}
This error message...
Exception in thread "main" java.net.MalformedURLException: no protocol:
...implies that your program was trying to access a URL that doesn't have a protocol, i.e. HTTP or HTTPS is absent.
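To see the message in isolation, here is a minimal sketch that only parses strings with java.net.URL and makes no network calls. The class name NoProtocolDemo and the helper describe are hypothetical names chosen for illustration; the sample inputs are assumptions standing in for href values a page might contain.

```java
import java.net.MalformedURLException;
import java.net.URL;

class NoProtocolDemo {
    // Returns the exception message if the string cannot be parsed as a URL,
    // or "ok" if parsing succeeds. Nothing is fetched over the network.
    static String describe(String spec) {
        try {
            new URL(spec);
            return "ok";
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(""));                      // empty href value
        System.out.println(describe("/people"));               // relative path, no scheme
        System.out.println(describe("https://www.weil.com/")); // absolute URL with a scheme
    }
}
```

The first two inputs have no scheme, so the constructor rejects them before any connection is attempted, which matches the stack trace pointing at java.net.URL.<init>.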
Your logic was near perfect. A few words:
It is possible that some of the <a> elements within the webpage https://www.weil.com/ have an href attribute with no value assigned. As an example:
<a class="canvas-button ss-icon" href="">?</a>
<a class="search-button ss-icon" href="">Search</a>
That is the reason this line still counts those elements, since an empty href is not null:
System.out.println("Size of active links and images--->"+ activeLinks.size()); //prints: Size of active links and images--->72
But if you print the href attribute:
for(int i=0; i<activeLinks.size(); i++) System.out.println(activeLinks.get(i).getAttribute("href"));
The first two lines are blank as follows:
<blank>
<blank>
https://www.weil.com/
https://www.weil.com/
https://www.weil.com/people
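One way to keep such blank values out of the list is to test each href string before it is handed to new URL(...). The sketch below is an assumption-laden illustration rather than the fix used in this answer: HrefFilter, isCheckable and keepCheckable are hypothetical names, and it assumes only http and https links should be checked.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class HrefFilter {
    // True only for values that are present, non-blank, and start with http:// or https://
    // (assumption: other schemes such as mailto: are not worth checking).
    static boolean isCheckable(String href) {
        if (href == null) {
            return false;
        }
        String lower = href.trim().toLowerCase(Locale.ROOT);
        return lower.startsWith("http://") || lower.startsWith("https://");
    }

    // Returns a new list containing only the checkable href values, in the original order.
    static List<String> keepCheckable(List<String> hrefs) {
        List<String> kept = new ArrayList<>();
        for (String href : hrefs) {
            if (isCheckable(href)) {
                kept.add(href);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList("", null, "https://www.weil.com/", "https://www.weil.com/people");
        System.out.println(keepCheckable(hrefs));
    }
}
```

In the Selenium loop, the same check could replace the `!= null` test on getAttribute("href").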
I made a couple of simple tweaks in your code as follows:
Replaced findElements(By.tagName("a")) with findElements(By.xpath("//a[contains (@href, 'weil')]"))
Replaced findElements(By.tagName("img")) with findElements(By.xpath("//img[contains (@src, 'weil')]"))
Here is the execution result:
Code Block:
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

public class A_Chrome_Demo {
    public static void main(String[] args) throws IOException {
        System.setProperty("webdriver.chrome.driver", "C:\\Utility\\BrowserDrivers\\chromedriver.exe");
        ChromeOptions options = new ChromeOptions();
        options.addArguments("start-maximized");
        options.setExperimentalOption("excludeSwitches", Collections.singletonList("enable-automation"));
        options.setExperimentalOption("useAutomationExtension", false);
        WebDriver driver = new ChromeDriver(options);
        driver.get("https://www.weil.com/");
        // 1. collect only the links and images whose href/src contains 'weil'
        List<WebElement> linklist = driver.findElements(By.xpath("//a[contains (@href, 'weil')]"));
        linklist.addAll(driver.findElements(By.xpath("//img[contains (@src, 'weil')]")));
        System.out.println("Size of full links and images--->" + linklist.size());
        List<WebElement> activeLinks = new ArrayList<WebElement>();
        // 2. keep only the elements that have an href attribute
        for (int i = 0; i < linklist.size(); i++) {
            System.out.println(linklist.get(i).getAttribute("href"));
            if (linklist.get(i).getAttribute("href") != null)
                activeLinks.add(linklist.get(i));
        }
        System.out.println("Size of active links and images--->" + activeLinks.size());
        // 3. request each href and print the response message
        for (int j = 0; j < activeLinks.size(); j++) {
            HttpURLConnection connection = (HttpURLConnection) new URL(activeLinks.get(j).getAttribute("href")).openConnection();
            connection.connect();
            String response = connection.getResponseMessage();
            connection.disconnect();
            System.out.println(activeLinks.get(j).getAttribute("href") + " --->" + response);
        }
    }
}
Console Output:
Size of full links and images--->46
https://www.weil.com/about-weil
https://extranet.weil.com/
https://login.weil.com/
https://www.weil.com/articles/weil-elects-16-new-partners-and-announces-new-counsel-class-2019
https://www.weil.com/articles/weil-announces-weil-legal-innovators-program
https://www.weil.com/articles/weil-partners-receive-top-honors-in-2019
https://www.weil.com/articles/two-weil-partners-named-among-turnarounds-workouts-outstanding-restructuring-lawyers-for-2019
https://careers.weil.com/
https://www.weil.com/articles/weil-wins-five-2019-law360-practice-group-of-the-year-awards
https://www.weil.com/articles/weil-earns-2020-litigation-department-of-the-year-honorable-mention-from-the-american-lawyer
https://www.weil.com/articles/weil-leads-three-of-the-five-top-bankruptcy-cases-of-2019
https://www.weil.com/about-weil/about-weil-prominent-matters
https://www.weil.com/articles/weil-represented-french-state-in-landmark-privatization-and-ipo-of-francaise-des-jeux
https://www.weil.com/articles/weil-litigators-clinch-four-win-week-showcasing-cross-departmental-strengths
https://www.weil.com/articles/weil-advised-guggenheim-securities-and-morgan-stanley-on-jack-in-the-boxs-1-3b-securitization
https://www.weil.com/about-weil/not-for-profit
https://www.weil.com/articles/weil-secures-asylum-for-burkina-faso-native-escaping-persecution
https://www.weil.com/articles/weils-2019-pro-bono-annual-review-our-finest-hours
https://www.weil.com/articles/weil-and-nysba-task-force-deliver-report-on-wrongful-convictions-in-new-york-state
https://www.weil.com/about-weil/diversity-and-inclusion
https://www.weil.com/articles/weil-named-a-2020-best-place-to-work-for-lgbtq-equality
https://www.weil.com/articles/three-weil-partners-named-best-practitioners-in-their-fields
http://business-finance-restructuring.weil.com/
http://eurorestructuring.weil.com/
http://privateequity.weil.com/
http://governance.weil.com/
http://product-liability.weil.com/
https://tax.weil.com/
https://tax.weil.com/
https://tax.weil.com/
https://tax.weil.com/
https://tax.weil.com/
https://tax.weil.com/
https://tax.weil.com/
https://tax.weil.com/latest-thinking/cryptoassets-hmrc-uk-tax-net-widens/
http://business-finance-restructuring.weil.com/automatic-stay/denial-of-stay-relief-is-a-final-order-says-the-us-supreme-court/
http://business-finance-restructuring.weil.com/news/weil-wins-five-2019-law360-practice-group-of-the-year-awards/
https://www.weil.com/about-weil/green-policy
https://www.weil.com/about-weil/sitemap
https://www.weil.com/about-weil/privacy-policy
https://www.weil.com/about-weil/privacy-shield-notice
https://www.weil.com/about-weil/regulatory-information
https://www.weil.com/about-weil/disclaimer
null
null
null
Size of active links and images--->43
https://www.weil.com/about-weil --->OK
https://extranet.weil.com/ --->OK
https://login.weil.com/ --->OK
https://www.weil.com/articles/weil-elects-16-new-partners-and-announces-new-counsel-class-2019 --->OK
https://www.weil.com/articles/weil-announces-weil-legal-innovators-program --->OK
https://www.weil.com/articles/weil-partners-receive-top-honors-in-2019 --->OK
https://www.weil.com/articles/two-weil-partners-named-among-turnarounds-workouts-outstanding-restructuring-lawyers-for-2019 --->OK
https://careers.weil.com/ --->OK
https://www.weil.com/articles/weil-wins-five-2019-law360-practice-group-of-the-year-awards --->OK
https://www.weil.com/articles/weil-earns-2020-litigation-department-of-the-year-honorable-mention-from-the-american-lawyer --->OK
https://www.weil.com/articles/weil-leads-three-of-the-five-top-bankruptcy-cases-of-2019 --->OK
https://www.weil.com/about-weil/about-weil-prominent-matters --->OK
https://www.weil.com/articles/weil-represented-french-state-in-landmark-privatization-and-ipo-of-francaise-des-jeux --->OK
https://www.weil.com/articles/weil-litigators-clinch-four-win-week-showcasing-cross-departmental-strengths --->OK
https://www.weil.com/articles/weil-advised-guggenheim-securities-and-morgan-stanley-on-jack-in-the-boxs-1-3b-securitization --->OK
https://www.weil.com/about-weil/not-for-profit --->OK
https://www.weil.com/articles/weil-secures-asylum-for-burkina-faso-native-escaping-persecution --->OK
https://www.weil.com/articles/weils-2019-pro-bono-annual-review-our-finest-hours --->OK
https://www.weil.com/articles/weil-and-nysba-task-force-deliver-report-on-wrongful-convictions-in-new-york-state --->OK
https://www.weil.com/about-weil/diversity-and-inclusion --->OK
https://www.weil.com/articles/weil-named-a-2020-best-place-to-work-for-lgbtq-equality --->OK
https://www.weil.com/articles/three-weil-partners-named-best-practitioners-in-their-fields --->OK
http://business-finance-restructuring.weil.com/ --->Forbidden
http://eurorestructuring.weil.com/ --->Forbidden
http://privateequity.weil.com/ --->Forbidden
http://governance.weil.com/ --->Forbidden
http://product-liability.weil.com/ --->Forbidden
https://tax.weil.com/ --->Forbidden
https://tax.weil.com/ --->Forbidden
https://tax.weil.com/ --->Forbidden
https://tax.weil.com/ --->Forbidden
https://tax.weil.com/ --->Forbidden
https://tax.weil.com/ --->Forbidden
https://tax.weil.com/ --->Forbidden
https://tax.weil.com/latest-thinking/cryptoassets-hmrc-uk-tax-net-widens/ --->Forbidden
http://business-finance-restructuring.weil.com/automatic-stay/denial-of-stay-relief-is-a-final-order-says-the-us-supreme-court/ --->Forbidden
http://business-finance-restructuring.weil.com/news/weil-wins-five-2019-law360-practice-group-of-the-year-awards/ --->Forbidden
https://www.weil.com/about-weil/green-policy --->OK
https://www.weil.com/about-weil/sitemap --->OK
https://www.weil.com/about-weil/privacy-policy --->OK
https://www.weil.com/about-weil/privacy-shield-notice --->OK
https://www.weil.com/about-weil/regulatory-information --->OK
https://www.weil.com/about-weil/disclaimer --->OK
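The output above prints the response message ("OK", "Forbidden"), which still has to be read by eye. A possible variation, sketched here under assumptions that are not part of the original answer, is to compare the numeric status code instead. LinkStatus, isBroken and fetchStatus are hypothetical names, treating every code of 400 or above as broken is an assumption, and so are the HEAD method and the 5-second timeouts.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class LinkStatus {
    // Assumption: any 4xx or 5xx status counts as a broken link.
    static boolean isBroken(int statusCode) {
        return statusCode >= 400;
    }

    // Sends a HEAD request and returns the HTTP status code.
    // Requires network access; timeouts are illustrative values.
    static int fetchStatus(String href) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(href).openConnection();
        connection.setRequestMethod("HEAD");
        connection.setConnectTimeout(5000);
        connection.setReadTimeout(5000);
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass URLs on the command line to check them; with no arguments nothing is fetched.
        for (String href : args) {
            int status = fetchStatus(href);
            System.out.println(href + " --->" + status + (isBroken(status) ? " (broken)" : " (ok)"));
        }
    }
}
```

Under this rule the "Forbidden" (403) entries in the console output would be reported as broken; whether that is the desired interpretation depends on what the check is meant to catch.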
You can find a relevant detailed discussion in:
This is because the web page contains <a> tag elements whose href attribute points to nothing, i.e. the top left-most list-drawer icon and the search icon.
Refer to the attached image.
Wrapping the URL creation in a try-catch block for java.net.MalformedURLException could help here: it lets the loop skip the offending link and move ahead with the desired flow.
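A minimal sketch of that try-catch idea, reduced to plain strings so it runs without a browser. SkipMalformed and parseAll are hypothetical names, and the sample href values are assumptions; in the real loop the strings would come from getAttribute("href").

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SkipMalformed {
    // Parses each href; malformed values are logged and skipped instead of
    // aborting the whole run. No connection is opened here.
    static List<URL> parseAll(List<String> hrefs) {
        List<URL> urls = new ArrayList<>();
        for (String href : hrefs) {
            try {
                urls.add(new URL(href));
            } catch (MalformedURLException e) {
                System.out.println("Skipping malformed href '" + href + "': " + e.getMessage());
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList("", "/people", "https://www.weil.com/");
        List<URL> urls = parseAll(hrefs);
        System.out.println("Parsed " + urls.size() + " of " + hrefs.size() + " hrefs");
    }
}
```

The surviving URL objects can then be passed to openConnection() as in the original loop.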