
Xamarin.Forms (UWP) - How Can I Get a WebView's DOM as an HTML String?

In a Xamarin.Forms (UWP) project, I have a WebView control whose Source is created with an HTML string, like this:

var webview = new Xamarin.Forms.WebView
{
    Source = new HtmlWebViewSource
    {
        Html = "<html>....</html>"
    }
};

The HTML contains JavaScript that dynamically generates HTML inside the <body>. This renders perfectly on screen, which means the WebView understands the DOM that the JavaScript creates. Great.

Now I need to parse some of the generated HTML, but all I can seem to access is the original HTML string that I passed in as the Source, not the final generated DOM.

Is there a way to convert the DOM generated by the JavaScript and understood by the WebView into a string, so that I can parse it (using a library like HTML Agility Pack or AngleSharp) and pull out some segments of the HTML? The solution can be in Xamarin.Forms or UWP (the platform I'm targeting).

NOTE: In full disclosure (in case it helps, and to avoid accusations of this being an XY problem), I am ultimately trying to solve the problem of printing a WebView with multiple pages on UWP, and research on this has turned up very little information. I have a solution that works for HTML that is not dynamically generated with JavaScript: I pull out the parts of the HTML that represent printable pages and add those as separate pages for print and print preview. But as mentioned earlier, I can't seem to parse dynamically generated content.

My first thought was to use the Eval method built into Xamarin.Forms, but it turns out that this method does not return anything, so it is suitable only for app-to-WebView communication.

So far, the easiest way I have found to implement this is with a custom version of the WebView control:

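using System.Threading.Tasks;
using Xamarin.Forms;

public class ExtendedWebView : WebView
{
    // Returns a task so that the platform renderer can answer asynchronously.
    public delegate Task<string> GetHtmlRequestedHandler();

    // The platform renderer subscribes to this event to supply the HTML.
    public event GetHtmlRequestedHandler GetHtmlRequested;

    // Returns the current DOM as an HTML string, or null if no renderer is subscribed.
    public async Task<string> GetHtmlAsync()
    {
        var handler = GetHtmlRequested;
        if (handler != null)
        {
            return await handler.Invoke();
        }
        return null;
    }
}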

Now, in the UWP platform project, create a custom renderer:

using System; // required to await the IAsyncOperation returned by InvokeScriptAsync
using System.Threading.Tasks;
using Xamarin.Forms;
using Xamarin.Forms.Platform.UWP;

[assembly: ExportRenderer(typeof(ExtendedWebView), typeof(ExtendedWebViewRenderer))]
namespace App.UWP
{
    public class ExtendedWebViewRenderer : WebViewRenderer
    {
        protected override void OnElementChanged(ElementChangedEventArgs<WebView> e)
        {
            base.OnElementChanged(e);

            // Unsubscribe from the old element so the renderer is not kept alive.
            if (e.OldElement is ExtendedWebView oldView)
            {
                oldView.GetHtmlRequested -= Ew_GetHtmlRequested;
            }

            if (e.NewElement is ExtendedWebView newView)
            {
                newView.GetHtmlRequested += Ew_GetHtmlRequested;
            }
        }

        // Runs eval inside the native UWP WebView and returns the serialized DOM.
        private async Task<string> Ew_GetHtmlRequested()
        {
            return await Control.InvokeScriptAsync("eval", new string[] { "document.documentElement.outerHTML;" });
        }
    }
}

The trick is to invoke the JavaScript eval function inside the web view: it evaluates document.documentElement.outerHTML and returns the current DOM, including any script-generated content, as a string.

You just need to replace the WebView in XAML with our ExtendedWebView and call its GetHtmlAsync method whenever needed.
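As a rough illustration of that call, here is a minimal sketch that fetches the generated HTML and pulls out page fragments with HTML Agility Pack. It assumes the ExtendedWebView class above and the HtmlAgilityPack NuGet package; the PrintablePageExtractor helper and the "printable-page" class name are hypothetical and not part of the original answer.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using HtmlAgilityPack;

// Hypothetical helper, shown only to illustrate how GetHtmlAsync might be used.
public static class PrintablePageExtractor
{
    public static async Task<IList<string>> GetPagesAsync(ExtendedWebView webView)
    {
        // Null when no platform renderer has subscribed to the event.
        string html = await webView.GetHtmlAsync();
        if (string.IsNullOrEmpty(html))
        {
            return new List<string>();
        }

        var document = new HtmlDocument();
        document.LoadHtml(html);

        // Assumption: each printable page is a <div> whose class attribute is
        // exactly "printable-page". Adjust the XPath to match your own markup.
        var nodes = document.DocumentNode.SelectNodes("//div[@class='printable-page']");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (nodes == null)
        {
            return new List<string>();
        }

        return nodes.Select(node => node.OuterHtml).ToList();
    }
}
```

Call this only after the page has loaded, for example from a handler for the WebView's Navigated event. Even then, scripts that build the DOM asynchronously may not have finished, so some extra signalling from the page may be needed.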

The one thing I dislike about this solution is that the event has a Task<string> return type, which is odd; an event with any return type at all is unusual. A cleaner design would put a property on a custom EventArgs class that the native renderer sets with the result of the operation. However, because InvokeScriptAsync is asynchronous (and the synchronous InvokeScript method is obsolete and should no longer be used), we would also need a task that completes only once that property has been set. UWP uses a similar approach for some of its events: a "deferral" tells the caller that the event is finished only after some asynchronous operation completes. I will try to find an authoritative answer on how calling a native asynchronous operation should be implemented for custom views :-).
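One possible shape for that EventArgs-based alternative uses TaskCompletionSource<string> as the task that completes when the result is set. This is a sketch under assumptions, not the answer's implementation: GetHtmlRequestedEventArgs and HtmlRequestSource are hypothetical names, and HtmlRequestSource stands in for ExtendedWebView so that the example compiles without Xamarin.Forms.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical event args: the handler reports its result through SetHtml.
public class GetHtmlRequestedEventArgs : EventArgs
{
    private readonly TaskCompletionSource<string> _completion =
        new TaskCompletionSource<string>();

    // Awaited by the shared code; completes once a handler supplies a result.
    public Task<string> HtmlTask => _completion.Task;

    // Called by the platform renderer when the native call succeeds.
    public void SetHtml(string html) => _completion.TrySetResult(html);

    // Called by the platform renderer when the native call fails.
    public void SetError(Exception error) => _completion.TrySetException(error);
}

// Stand-in for ExtendedWebView; in the app this logic would live on the view.
public class HtmlRequestSource
{
    // A conventional void-returning event; the result travels in the args.
    public event EventHandler<GetHtmlRequestedEventArgs> GetHtmlRequested;

    public Task<string> GetHtmlAsync()
    {
        var handler = GetHtmlRequested;
        if (handler == null)
        {
            // No renderer attached: mirror the original behaviour and yield null.
            return Task.FromResult<string>(null);
        }

        var args = new GetHtmlRequestedEventArgs();
        handler(this, args);
        return args.HtmlTask;
    }
}
```

In the renderer, the handler would become an async void method that awaits Control.InvokeScriptAsync inside a try/catch, passing the result to SetHtml or the exception to SetError. One caveat of this design: if a handler never calls either method, the returned task never completes, so a timeout may be worth adding.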


 