
The web browser shows the correct values, but when I use Jsoup the HTML doesn't have the values

I'm trying to get some values from a site, but these values only appear when I use a browser such as Mozilla Firefox. When I use Jsoup I can get the HTML from the site, but without the values, only the tags.

This is the site I'm trying to parse:

http://www.submarinoviagens.com.br/Passagens/selecionarvoo?Origem=nat&Destino=mia&Data=05/11/2012&Hora=&Origem=mia&Destino=nat&Data=09/11/2012&Hora=&NumADT=1&NumCHD=0&NumINF=0&SomenteDireto=0&Cia=&SelCabin=&utm_source=&utm_medium=&utm_campaign=&CPId=

I'm trying to get the values that appear inside these span tags:

If I open the URL above in a web browser I can see the following values: '', 'R$ 2634,22' and 'R$ 2634,22', but when I use the following code the values disappear.

import java.net.URL;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Fetch and parse the page with a 100-second timeout
URL url = new URL("http://www.submarinoviagens.com.br/Passagens/selecionarvoo?Origem=nat&Destino=mia&Data=05/11/2012&Hora=&Origem=mia&Destino=nat" +
        "&Data=09/11/2012&Hora=&NumADT=1&NumCHD=0&NumINF=0&SomenteDireto=0&Cia=&SelCabin=&utm_source=&utm_medium=&utm_campaign=&CPId=");
Document doc = Jsoup.parse(url, 100000);
String title = doc.title();
System.out.println(doc.toString());

If I view the page source in Mozilla Firefox, the values are missing there too. But if I use the Firebug plugin I can see them.

Thanks for the help!

The website uses JavaScript to populate all of the values you are trying to parse. Jsoup only parses the HTML the server sends and does not run scripts, so you will have to use a library that can execute the JavaScript within the page. Not sure if there is one though.
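To illustrate why the tags arrive empty, here is a small sketch using only the Java standard library. The markup in `SERVER_HTML` is a hypothetical stand-in, not copied from the real site, and the class and method names are made up for the example. It shows that when a script fills in a span after the page loads, anything that only reads the raw HTML sees an empty span.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StaticMarkupDemo {

    // Hypothetical markup (an assumption for illustration): the span is sent
    // empty and an inline script writes the price into it in the browser.
    static final String SERVER_HTML =
            "<html><body>"
          + "<span id=\"price\"></span>"
          + "<script>document.getElementById('price').textContent = 'R$ 2634,22';</script>"
          + "</body></html>";

    // Returns the raw inner text of every <span> in the given HTML.
    // A regex is enough for this tiny demo; use a real parser for real pages.
    static List<String> spanContents(String html) {
        List<String> contents = new ArrayList<>();
        Matcher m = Pattern.compile("<span[^>]*>(.*?)</span>", Pattern.DOTALL).matcher(html);
        while (m.find()) {
            contents.add(m.group(1));
        }
        return contents;
    }

    public static void main(String[] args) {
        List<String> spans = spanContents(SERVER_HTML);
        // The script is never executed here, so the span is still empty.
        System.out.println("spans found: " + spans.size());
        System.out.println("first span is empty: " + spans.get(0).isEmpty());
    }
}
```

This is also why "view source" in the browser shows no values while Firebug does: view source shows the markup as it was sent, and Firebug shows the live document after the scripts have run.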

Anyone else?

HtmlUnit is a headless browser that executes JavaScript, and it should be able to render this page correctly.
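A minimal sketch of that approach, under several assumptions: HtmlUnit 3.x and Jsoup are on the classpath (HtmlUnit 2.x uses the `com.gargoylesoftware.htmlunit` package instead of `org.htmlunit`), the class and method names are hypothetical, and the 10-second wait is a guess that may need tuning. It has not been verified against this particular site, which may still fail if its scripts are not supported.

```java
import java.util.ArrayList;
import java.util.List;

import org.htmlunit.BrowserVersion;
import org.htmlunit.WebClient;
import org.htmlunit.html.HtmlPage;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class RenderedPageFetcher {

    // Loads the page in a headless browser, lets its scripts run, and hands
    // the resulting DOM to Jsoup so the usual selectors can be used.
    static Document fetchRendered(String url, long jsWaitMillis) throws Exception {
        try (WebClient client = new WebClient(BrowserVersion.FIREFOX)) {
            // Real-world pages often contain scripts HtmlUnit cannot run; keep going.
            client.getOptions().setThrowExceptionOnScriptError(false);
            client.getOptions().setCssEnabled(false);

            HtmlPage page = client.getPage(url);
            // Give background JavaScript (e.g. AJAX calls) time to finish.
            client.waitForBackgroundJavaScript(jsWaitMillis);

            return Jsoup.parse(page.asXml(), url);
        }
    }

    // Collects the text of every span that actually has text in it.
    static List<String> nonEmptySpanTexts(Document doc) {
        List<String> texts = new ArrayList<>();
        for (Element span : doc.select("span")) {
            if (span.hasText()) {
                texts.add(span.text());
            }
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        // Pass the page URL as the first command-line argument.
        Document doc = fetchRendered(args[0], 10_000);
        nonEmptySpanTexts(doc).forEach(System.out::println);
    }
}
```

Handing the rendered markup back to Jsoup keeps the rest of the parsing code unchanged; only the fetching step is replaced.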
