// The XPath expressions below select text() nodes, which are not HtmlElement
// instances, so the lists are declared with a wildcard element type.
List<?> ips = null;
List<?> ports = null;
ArrayList<String> proxies = new ArrayList<>();
HtmlPage page = null;
String baseUrl = "http://www.freeproxylists.net/";
WebClient client;
try {
    client = new WebClient();
    client.getOptions().setJavaScriptEnabled(false);
    page = client.getPage(baseUrl);
    // First column of the DataGrid table holds the IP, second column the port.
    ips = page.getByXPath("//table[@class='DataGrid']/tbody/tr/td[position()=1]/text()");
    ports = page.getByXPath("//table[@class='DataGrid']/tbody/tr/td[position()=2]/text()");
    for (int i = 0; i < ips.size(); i++) {
        proxies.add(ips.get(i) + ":" + ports.get(i));
        System.out.println(ips.get(i) + ":" + ports.get(i));
    }
} catch (Exception e) {
    System.out.println(e);
}
I'm trying to scrape proxies from the site, and I get these warnings:
mag 20, 2018 4:04:56 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
AVVERTENZA: CSS error: 'http://www.freeproxylists.net/grid.css' [1:1] Error in rule. (Invalid token "<". Was expecting one of: , , "", ".", ":", " ", "[", , , , , , , .)
mag 20, 2018 4:04:56 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
AVVERTENZA: CSS warning: 'http://www.freeproxylists.net/grid.css' [1:1] Ignoring the whole rule.
mag 20, 2018 4:04:56 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
AVVERTENZA: CSS error: 'http://www.freeproxylists.net/grid.css' [45:1] Error in rule. (Invalid token "<". Was expecting one of: , , "", ".", ":", " ", "[", , , , , , .)
mag 20, 2018 4:04:56 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
AVVERTENZA: CSS warning: 'http://www.freeproxylists.net/grid.css' [45:1] Ignoring the whole rule.
How can I fix this?
You can simply replace the DefaultCssErrorHandler used by your WebClient with the SilentCssErrorHandler.
The HtmlUnit FAQ page has a short sample for this.
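As a rough sketch of that replacement (not the FAQ's exact sample), the handler can be swapped right after the client is created. The class name QuietClient and the helper newQuietClient are made up for illustration, and the com.gargoylesoftware.htmlunit package assumes an HtmlUnit 2.x release like the one in the question; newer releases may use a different package name.

```java
import com.gargoylesoftware.htmlunit.SilentCssErrorHandler;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class QuietClient {

    // Hypothetical helper: builds a WebClient that discards CSS parse
    // errors and warnings instead of logging them.
    static WebClient newQuietClient() {
        WebClient client = new WebClient();
        client.getOptions().setJavaScriptEnabled(false);
        // Replace the default (logging) CSS error handler with the silent one.
        client.setCssErrorHandler(new SilentCssErrorHandler());
        return client;
    }

    public static void main(String[] args) throws Exception {
        // try-with-resources closes the client when the block exits.
        try (WebClient client = newQuietClient()) {
            HtmlPage page = client.getPage("http://www.freeproxylists.net/");
            System.out.println(page.getTitleText());
        }
    }
}
```

This only silences the CSS messages. The messages themselves point at the site's grid.css, not at the scraping code, so the XPath extraction from the question should be unaffected either way.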