
jsoup Java parser: cannot get all the HTML content from a website

I am trying to get all the HTML tags of the website https://launch.stellar.org/#/login .

But my result does not contain any input tags, unlike what I see when I inspect this website with the F12 tool in Firefox.

I do not understand why. What is the solution to this problem?

Here is my code:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    import javax.net.ssl.HttpsURLConnection;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    public class HttpUrlConnect {

        private HttpsURLConnection conn;

        private final String USER_AGENT = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36";

        public static void main(String[] args) throws Exception {
            String url = "https://launch.stellar.org/#/login";

            HttpUrlConnect http = new HttpUrlConnect();

            // 1. Send a "GET" request, so that you can extract the form's data.
            String page = http.GetPageContent(url);
            Document doc = Jsoup.parse(page);
            System.out.println(doc);
        }

        String GetPageContent(String url) throws Exception {
            URL obj = new URL(url);
            conn = (HttpsURLConnection) obj.openConnection();

            // default is GET
            conn.setRequestMethod("GET");
            conn.setUseCaches(false);

            // act like a browser
            conn.setRequestProperty("Host", "wallet.stellar.org");
            conn.setRequestProperty("User-Agent", USER_AGENT);
            conn.setRequestProperty("Accept",
                    "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
            conn.setRequestProperty("Accept-Language",
                    "vi-VN,vi;q=0.8,fr-FR;q=0.6,fr;q=0.4,en-US;q=0.2,en;q=0.2");

            int responseCode = conn.getResponseCode();
            System.out.println("\nSending 'GET' request to URL : " + url);
            System.out.println("Response Code : " + responseCode);

            // Read the response body line by line
            BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
            String inputLine;
            StringBuffer response = new StringBuffer();
            while ((inputLine = in.readLine()) != null) {
                response.append(inputLine);
            }
            in.close();

            return response.toString();
        }
    }

I downloaded the jsoup library from here: http://jsoup.org/download

"But my result does not contain any input tags, unlike what I see when I inspect this website with the F12 tool in Firefox."

The F12 tool (Inspector/Firebug) shows you the source with all the modifications that JavaScript has made to the page after your client (Firefox) opened it.

In fact, if you look at the source received from the server (Ctrl+U), you will see that there is no input element in the page.
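The same check can be made from Java without a browser. The sketch below is only an illustration: the class name `RawHtmlCheck` and its two helper methods are hypothetical, it assumes Java 11 or later for `java.net.http.HttpClient`, and the text search is a rough test rather than real HTML parsing. Note that the `#/login` part of the URL is a fragment, which is never sent to the server, so it is left out of the request.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Pattern;

final class RawHtmlCheck {

    // Matches the start of an <input ...> tag, case-insensitively.
    // This is a plain text search, not an HTML parser.
    private static final Pattern INPUT_TAG =
            Pattern.compile("<input\\b", Pattern.CASE_INSENSITIVE);

    /** Returns true if the given HTML text contains at least one input tag. */
    static boolean containsInputTag(String html) {
        return INPUT_TAG.matcher(html).find();
    }

    /** Downloads the page exactly as the server sends it; no JavaScript is run. */
    static String fetchRawHtml(String url) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        // The URL from the question, without the "#/login" fragment.
        String url = args.length > 0 ? args[0] : "https://launch.stellar.org/";
        String html = fetchRawHtml(url);
        System.out.println("Raw HTML contains an input tag: " + containsInputTag(html));
    }
}
```

If this reports no input tag while the F12 tool shows one, the element was added by JavaScript after the page loaded.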

The markup you see is all generated by JavaScript, so you'll need a tool that interprets the JavaScript code and gives you the resulting HTML.


jsoup is just an HTML parser; it does not execute JavaScript. To get the rendered page, you'll need to switch to a tool such as Selenium or HtmlUnit.
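Selenium and HtmlUnit both need extra dependencies, so the sketch below swaps in a different technique to illustrate the same idea with only the JDK: let a browser engine run the page's JavaScript, then read the resulting HTML. It shells out to headless Chrome with its `--dump-dom` flag. This is an assumption-laden example, not the answer's method: the class `HeadlessDump` and its methods are hypothetical, the binary name `google-chrome` depends on the local installation, and Java 9 or later is assumed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class HeadlessDump {

    /** Builds the command line; chromeBinary is whatever the browser is called locally. */
    static List<String> dumpDomCommand(String chromeBinary, String url) {
        return List.of(chromeBinary, "--headless", "--disable-gpu", "--dump-dom", url);
    }

    /** Runs the browser and returns the DOM it prints after loading the page. */
    static String renderedHtml(String chromeBinary, String url)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(dumpDomCommand(chromeBinary, url))
                .redirectError(ProcessBuilder.Redirect.DISCARD) // drop browser log noise
                .start();
        String html = new String(process.getInputStream().readAllBytes(),
                StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Browser exited with code " + exitCode);
        }
        return html;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: Chrome is installed and reachable under this name.
        String binary = args.length > 0 ? args[0] : "google-chrome";
        String html = renderedHtml(binary, "https://launch.stellar.org/#/login");
        System.out.println(html);
    }
}
```

The returned string can be passed to `Jsoup.parse(...)` in place of the raw server response. If the page builds its form some time after the load event, the dump may still be incomplete; Selenium, which drives a real browser from Java and lets you wait for a specific element, is usually the more robust choice in that case.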

The text that comes from the server is the same as the output you got from jsoup.

The input tags are created dynamically by JavaScript after the page has loaded in the web browser, which is why you cannot see them in your result.
