Can I extract information from LinkedIn using the Java HtmlUnit library?

I have tried hard to find a way to extract data from my LinkedIn account without using the REST API, but without any result. Does anyone know whether it is possible, and how? When I ran this code in Eclipse, the result was either a NullPointerException or null when I selected some fields from the response HTML page. Note that the selector path works well in the browser console. Thank you very much.

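import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlPasswordInput;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;

String url = "https://www.linkedin.com/uas/login?goback=&trk=hb_signin";
final WebClient webClient = new WebClient();
webClient.getOptions().setJavaScriptEnabled(false);
webClient.getOptions().setCssEnabled(false);

// Load the login page and look up the form controls by name.
HtmlPage loginPage = webClient.getPage(url);
final HtmlForm loginForm = loginPage.getFormByName("login");
final HtmlSubmitInput button = loginForm.getInputByName("signin");
final HtmlTextInput usernameTextField = loginForm.getInputByName("session_key");
final HtmlPasswordInput passwordTextField = loginForm.getInputByName("session_password");

// Fill in the credentials and submit the form.
usernameTextField.setValueAttribute("something@outlook.com");
passwordTextField.setValueAttribute("**************");
final HtmlPage response = button.click();

// Load the profile page and print the element matched by the selector.
loginPage = webClient.getPage("https://www.linkedin.com/in/issa-hammoud-0a2802114/");
System.out.println(loginPage.querySelector("#profile-wrapper > div.pv-content.profile-view-grid.neptune-grid.two-column.ghost-animate-in > div.core-rail > section div > div > button > img"));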

Since you are connecting over HTTPS, you may need to call webClient.getOptions().setUseInsecureSSL(true); so that HtmlUnit accepts the server's certificate without validating it.

Also make sure cookies are enabled, so that the login session carries over to the next page request: webClient.getCookieManager().setCookiesEnabled(true);
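The two settings suggested above could be applied together roughly like this. It is only a sketch: the class name WebClientSetup and the helper newClient() are made up for illustration, and the import assumes HtmlUnit 2.x (in HtmlUnit 3.x the package is org.htmlunit). It needs the HtmlUnit jar on the classpath.

```java
import com.gargoylesoftware.htmlunit.WebClient;

public class WebClientSetup {

    // Hypothetical helper: builds a WebClient with the two suggested settings.
    static WebClient newClient() {
        WebClient webClient = new WebClient();
        // Accept any server certificate without validating it.
        // This is insecure, so it is only reasonable for local experiments.
        webClient.getOptions().setUseInsecureSSL(true);
        // Keep cookies so the session from the login form is reused
        // by later getPage() calls on the same client.
        webClient.getCookieManager().setCookiesEnabled(true);
        return webClient;
    }

    public static void main(String[] args) {
        // WebClient is AutoCloseable, so try-with-resources releases it.
        try (WebClient webClient = newClient()) {
            System.out.println(webClient.getCookieManager().isCookiesEnabled());
        }
    }
}
```

The important part is that the same webClient instance is used for both the login and the profile request, otherwise the session cookies are lost.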

Having said that, you should really be using LinkedIn's REST API.

Hope that helps
