Is it possible to get the HTML code from WebView
I would like to preemptively get the HTML code of a webpage that is about to be loaded in a WebView, parse it using regex, and display only the HTML that I want, while letting the webpage still think it has loaded everything.
Is there any way to do that in WebViewClient.onLoadResource() or similar methods?
EDIT: I tried this:
class MyJavaScriptInterface
{
    @SuppressWarnings("unused")
    public void showHTML(String html, Context context)
    {
        new AlertDialog.Builder(context)
            .setTitle("HTML")
            .setMessage(html)
            .setPositiveButton(android.R.string.ok, null)
            .setCancelable(false)
            .create();
        pageHTML = html;
    }
}
@Override
public void customizeWebView(final ServiceCommunicableActivity activity, final WebView webview, final SearchResult mRom) {
    mRom.setFileSize(getFileSize(mRom.getURLSuffix()));
    webview.getSettings().setJavaScriptEnabled(true);
    MyJavaScriptInterface interfaceA = new MyJavaScriptInterface();
    webview.addJavascriptInterface(interfaceA, "HTMLOUT");
    WebViewClient anchorWebViewClient = new WebViewClient()
    {
        @Override
        public void onPageFinished(WebView view, String url)
        {
            /* This call inject JavaScript into the page which just finished loading. */
            webview.loadUrl("javascript:window.HTMLOUT.showHTML('<head>'+document.getElementsByTagName('html')[0].innerHTML+'</head>');");
            Pattern pattern = Pattern.compile("<h2>Winning Sc.+</h2></div>(.+)<br>", Pattern.DOTALL);
            Matcher matcher = pattern.matcher(pageHTML);
            matcher.find();
The interface is never called.
I had to use HttpClient. No cookies are required, just parsing the HTML:
private String getDownloadButtonOnly(String url){
    HttpGet pageGet = new HttpGet(url);
    // Turns the HTTP response body into a String, or null when there is no entity.
    ResponseHandler<String> handler = new ResponseHandler<String>() {
        public String handleResponse(HttpResponse response) throws ClientProtocolException, IOException {
            HttpEntity entity = response.getEntity();
            String html;
            if (entity != null) {
                html = EntityUtils.toString(entity);
                return html;
            } else {
                return null;
            }
        }
    };
    pageHTML = null;
    try {
        // Retry until a non-null body comes back; an exception ends the loop.
        while (pageHTML==null){
            pageHTML = client.execute(pageGet, handler);
        }
    } catch (ClientProtocolException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    // If the request failed, pageHTML is still null and there is nothing to match against.
    if (pageHTML == null) {
        return null;
    }
    Pattern pattern = Pattern.compile("<h2>Direct Down.+?</h2>(</div>)*(.+?)<.+?>", Pattern.DOTALL);
    Matcher matcher = pattern.matcher(pageHTML);
    String displayHTML = null;
    // Keep the last match found in the page.
    while(matcher.find()){
        displayHTML = matcher.group();
    }
    return displayHTML;
}
@Override
public void customizeWebView(final ServiceCommunicableActivity activity, final WebView webview, final SearchResult mRom) {
    mRom.setFileSize(getFileSize(mRom.getURLSuffix()));
    webview.getSettings().setJavaScriptEnabled(true);
    WebViewClient anchorWebViewClient = new WebViewClient()
    {
        @Override
        public void onPageStarted(WebView view, String url, Bitmap favicon) {
            super.onPageStarted(view, url, favicon);
            String downloadButtonHTML = getDownloadButtonOnly(url);
            // Only replace the content once per URL to avoid reloading in a loop.
            if(downloadButtonHTML!=null && !url.equals(lastLoadedURL)){
                lastLoadedURL = url;
                webview.loadDataWithBaseURL(url, downloadButtonHTML, null, "utf-8", url);
            }
        }
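To see what the pattern in getDownloadButtonOnly() captures, here is a minimal standalone sketch that runs the same regular expression against a made-up HTML string, with no Android or HttpClient dependency. The class name DownloadFragmentDemo, the method extractLast, and the sample markup are assumptions for illustration only; they are not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DownloadFragmentDemo {
    // Same pattern as in getDownloadButtonOnly(); DOTALL lets '.' match newlines too.
    private static final Pattern FRAGMENT = Pattern.compile(
            "<h2>Direct Down.+?</h2>(</div>)*(.+?)<.+?>", Pattern.DOTALL);

    /** Returns the last match of the pattern in html, or null if there is none. */
    static String extractLast(String html) {
        if (html == null) {
            return null;
        }
        Matcher matcher = FRAGMENT.matcher(html);
        String result = null;
        while (matcher.find()) {
            result = matcher.group();
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample page, not taken from any real site.
        String sample = "<div><h2>Direct Download</h2></div>"
                + "<a href=\"/file.zip\">Download</a><p>other</p>";
        // Prints: <h2>Direct Download</h2></div><a href="/file.zip">Download</a>
        System.out.println(extractLast(sample));
    }
}
```

Because the quantifiers are reluctant, the match stops at the first tag that closes after the heading, so this only works when the wanted fragment is a single simple element. For anything more nested, an HTML parser would probably be sturdier than a regex.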
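Here is a tutorial on extracting HTML from a WebView. Don't forget to read the warning at the end of the tutorial.
Try adding @JavascriptInterface before public void showHTML(String html, Context context). For apps targeting API level 17 or higher, only public methods annotated this way can be called from JavaScript.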
If you have a chance to influence the server side that serves the page, you can ask for a redirect to a particular page in case of an error. In your WebViewClient you can detect this redirect and use it as a signal of an error.
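The detection itself can be plain URL logic. Below is a minimal sketch, assuming the server redirects to a hypothetical /error path on failure; the class name ErrorRedirectCheck, the method isErrorRedirect, and the path are illustrative assumptions, not something the server in question is known to use. In a WebViewClient, such a check could be called with the URL passed to shouldOverrideUrlLoading() or onPageStarted().

```java
import java.net.URI;
import java.net.URISyntaxException;

class ErrorRedirectCheck {
    // Hypothetical path the server is assumed to redirect to on failure.
    static final String ERROR_PATH = "/error";

    /** Returns true when url parses as a URI and its path equals ERROR_PATH. */
    static boolean isErrorRedirect(String url) {
        if (url == null) {
            return false;
        }
        try {
            // getPath() ignores the query string and returns null for opaque URIs.
            String path = new URI(url).getPath();
            return ERROR_PATH.equals(path);
        } catch (URISyntaxException e) {
            // A malformed URL is not treated as the error redirect.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isErrorRedirect("https://example.com/error?code=500")); // true
        System.out.println(isErrorRedirect("https://example.com/downloads/42"));   // false
    }
}
```

Comparing only the path keeps the check independent of any query parameters the server may append to the error URL.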
Are there any suggestions on whether this works with AJAX-loaded pages too?
My web application loads its data via AJAX after the user logs in.
I want to get the content (parts of the HTML) and pass, e.g., an inner frame's HTML to the printer, or at least into an Android text box for further processing. Greetings, Hannes