Is it possible to get the HTML code from WebView

I would like to preemptively get the HTML code of a webpage that is about to be loaded in a WebView, parse it using regex, and display only the HTML code that I want, while letting the webpage still think it has loaded everything.

Is there any way to do that in WebViewClient.onLoadResource() or similar methods?

EDIT: I tried this:

class MyJavaScriptInterface
{
    @SuppressWarnings("unused")
    public void showHTML(String html, Context context)
    {
        new AlertDialog.Builder(context)
            .setTitle("HTML")
            .setMessage(html)
            .setPositiveButton(android.R.string.ok, null)
            .setCancelable(false)
            .create();
        pageHTML = html;
    }
}

@Override
    public void customizeWebView(final ServiceCommunicableActivity activity, final WebView webview, final SearchResult mRom) {
        mRom.setFileSize(getFileSize(mRom.getURLSuffix()));
        webview.getSettings().setJavaScriptEnabled(true);
        MyJavaScriptInterface interfaceA = new MyJavaScriptInterface();
        webview.addJavascriptInterface(interfaceA, "HTMLOUT");  
        WebViewClient anchorWebViewClient = new WebViewClient()
        {
            @Override  
            public void onPageFinished(WebView view, String url)  
            {  
                /* This call inject JavaScript into the page which just finished loading. */  
                webview.loadUrl("javascript:window.HTMLOUT.showHTML('<head>'+document.getElementsByTagName('html')[0].innerHTML+'</head>');");
                Pattern pattern = Pattern.compile("<h2>Winning Sc.+</h2></div>(.+)<br>", Pattern.DOTALL);
                Matcher matcher = pattern.matcher(pageHTML);
                matcher.find();

The interface is never called.

I had to use HttpClient. No cookies are required; it just fetches the page and parses the HTML:

private String getDownloadButtonOnly(String url){
    HttpGet pageGet = new HttpGet(url);

    ResponseHandler<String> handler = new ResponseHandler<String>() {
        public String handleResponse(HttpResponse response) throws ClientProtocolException, IOException {
            HttpEntity entity = response.getEntity();
            String html; 

            if (entity != null) {
                html = EntityUtils.toString(entity);
                return html;
            } else {
                return null;
            }
        }
    };

    pageHTML = null;
    try {
        // Retry until the server returns a body; an exception ends the loop.
        while (pageHTML == null) {
            pageHTML = client.execute(pageGet, handler);
        }
    } catch (ClientProtocolException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }

    // The request failed, so there is nothing to parse.
    if (pageHTML == null) {
        return null;
    }

    Pattern pattern = Pattern.compile("<h2>Direct Down.+?</h2>(</div>)*(.+?)<.+?>", Pattern.DOTALL);
    Matcher matcher = pattern.matcher(pageHTML);
    String displayHTML = null;
    // Keep the last match found on the page.
    while (matcher.find()) {
        displayHTML = matcher.group();
    }

    return displayHTML;
}

    @Override
    public void customizeWebView(final ServiceCommunicableActivity activity, final WebView webview, final SearchResult mRom) {
        mRom.setFileSize(getFileSize(mRom.getURLSuffix()));
        webview.getSettings().setJavaScriptEnabled(true);
        WebViewClient anchorWebViewClient = new WebViewClient()
        {

            @Override
            public void onPageStarted(WebView view, String url, Bitmap favicon) {
                super.onPageStarted(view, url, favicon);
                String downloadButtonHTML = getDownloadButtonOnly(url);
                if(downloadButtonHTML!=null && !url.equals(lastLoadedURL)){
                    lastLoadedURL = url;
                    webview.loadDataWithBaseURL(url, downloadButtonHTML, null, "utf-8", url);
                }
            }
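
Apache HttpClient, used above, was removed from the Android SDK in API level 23, so the same fetch-then-filter idea may be easier to try with java.net.HttpURLConnection. The following is only a minimal sketch, not the original author's code: the class name PageSnippets and its method names are hypothetical, the page is assumed to be UTF-8, and the regex is copied from the answer above. On Android the fetch would also have to run off the main thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper class; names are illustrative only. */
final class PageSnippets {

    // Same pattern as in the answer above. DOTALL lets '.' match line breaks.
    private static final Pattern DOWNLOAD_BLOCK = Pattern.compile(
            "<h2>Direct Down.+?</h2>(</div>)*(.+?)<.+?>", Pattern.DOTALL);

    /** Reads the whole stream and decodes it as UTF-8 (an assumption about the page). */
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Downloads the page body with a single GET request; call this off the UI thread. */
    static String fetch(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            conn.setConnectTimeout(10_000);
            conn.setReadTimeout(10_000);
            try (InputStream in = conn.getInputStream()) {
                return readAll(in);
            }
        } finally {
            conn.disconnect();
        }
    }

    /** Returns the last fragment matching the pattern, or null if there is none. */
    static String extractDownloadBlock(String pageHtml) {
        if (pageHtml == null) {
            return null;
        }
        Matcher matcher = DOWNLOAD_BLOCK.matcher(pageHtml);
        String last = null;
        while (matcher.find()) {
            last = matcher.group();
        }
        return last;
    }

    public static void main(String[] args) {
        // Made-up sample markup; no network access is needed for this demo.
        String sample = "<div><h2>Direct Download</h2></div><a href=\"/f.zip\">Get</a><br>";
        System.out.println(extractDownloadBlock(sample));
    }
}
```

The result of extractDownloadBlock(fetch(url)) could then be handed to webview.loadDataWithBaseURL(...) as in the answer above, after checking it for null.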

Here is a tutorial on extracting HTML from a WebView. Don't forget to read the warning at the end of the tutorial.

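Try adding @JavascriptInterface before public void showHTML(String html, Context context). When targeting API level 17 or higher, only public methods carrying this annotation are exposed to JavaScript.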

If you can influence the server that serves the page, you can ask it to redirect to a particular page when an error occurs. In your WebViewClient you can detect this redirect and use it as a signal that an error happened.

Are there any suggestions on whether this also works with AJAX-loaded pages?

My web application loads its data via AJAX after the user logs in.

I want to get the content (parts of the HTML content) and pass, for example, an inner frame's HTML to the printer, or at least into an Android text box, for further processing. Greetings, Hannes
