How to extract Google-translated text from the Google Translate website using jsoup on Android
I am using jsoup to retrieve data from a web page on Android. I am using the URL "https://translate.google.com/#hi/en/bharat%20mera%20desh%20hai" to translate "bharat mera desh hai" into "India is my country".
I want to get the English translation as output, but I am unable to.
Here is my code for extracting the English text:
@Override
protected Void doInBackground(Void... params) {
    try {
        // Connect to the web site
        Document document = Jsoup.connect(url).get();
        Elements englishText = document.select("span#result_box");
        EngText = englishText.text();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}
This is the HTML content:
<span id="result_box" class="short_text" lang="en">
<span class="" contenteditable="false" tabindex="-1">
India is my country
</span>
</span>
But I am getting an empty string in the EngText variable. I am able to retrieve other static text from the website, but not the English translation.
The value you are trying to get is not part of the initial HTML; it is set by JavaScript after the page has loaded. You can check this by disabling JavaScript in your browser.
Jsoup only fetches the static HTML; it does not execute JavaScript code.
To get what you want, consider using a tool such as HtmlUnit or Selenium.
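A related detail may help explain why the static HTML has no translation in it. In the URL from the question, the languages and the text to translate all sit after the `#`, which makes them a URL fragment. HTTP clients do not send the fragment to the server, so a plain fetch only ever asks for the bare page, and it is the page's JavaScript that reads the fragment afterwards. The sketch below uses only `java.net.URI` to split the question's URL into the two parts. The class and method names (`FragmentDemo`, `requestTarget`, `fragment`) are made up for illustration and are not part of jsoup.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper class, for illustration only.
class FragmentDemo {

    // Returns the URL without its fragment, i.e. the part an HTTP
    // client would actually request from the server.
    static String requestTarget(String url) {
        URI uri = URI.create(url);
        try {
            // Rebuild the URI from its components, passing null as the fragment.
            return new URI(uri.getScheme(), uri.getAuthority(),
                           uri.getPath(), uri.getQuery(), null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Returns the decoded fragment (the text after '#'), or null if there is none.
    static String fragment(String url) {
        return URI.create(url).getFragment();
    }

    public static void main(String[] args) {
        String url = "https://translate.google.com/#hi/en/bharat%20mera%20desh%20hai";
        System.out.println("Requested from server: " + requestTarget(url));
        System.out.println("Kept on the client:    " + fragment(url));
    }
}
```

Since the text to translate never reaches the server in this request, the response cannot contain the translated string, whatever selector is used. This is consistent with the advice above to use a tool that actually runs the page's JavaScript.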