Convert HTML table/chart element to image

This is something I have tried to do in my free time. However, I'm not yet sure of the complexities and problems I might face. I would like to go to a URL like https://fred.stlouisfed.org/series/DFII5 and save the chart on that page as an image, anywhere locally on my PC.

My first approach was to use the Html Agility Pack:

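    // Requires: using System.Linq; using HtmlAgilityPack;
    var document = new HtmlWeb().Load("https://fred.stlouisfed.org/series/DFII5");
    // Collect the src attribute of every <img> element, skipping empty values.
    var urls = document.DocumentNode.Descendants("img")
                                    .Select(e => e.GetAttributeValue("src", null))
                                    .Where(s => !String.IsNullOrEmpty(s));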

or the WinForms WebBrowser control:

    // Requires: using System; using System.IO; using System.Net; using System.Windows.Forms;
    private void GetWebpage(string url)
    {
        var browser = new WebBrowser();
        // Subscribe before navigating so the completion event cannot be missed.
        browser.DocumentCompleted += browser_DocumentCompleted;
        browser.Navigate(url);
    }

    void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        var browser = (WebBrowser)sender;
        using (var client = new WebClient())
        {
            foreach (HtmlElement image in browser.Document.Images)
            {
                var src = image.GetAttribute("src");
                if (string.IsNullOrEmpty(src)) continue;

                // Resolve relative paths against the page URL.
                var uri = new Uri(browser.Document.Url, src);

                // Append any path to filename as needed
                var filename = Path.GetFileName(uri.AbsolutePath);
                if (string.IsNullOrEmpty(filename)) continue;
                File.WriteAllBytes(filename, client.DownloadData(uri));
            }
        }
    }

Both approaches were able to fetch all the images from that webpage. However, the chart is what I want, and it's not an image.

Is this task possible? Would I need libraries/NuGet packages to do this? And how would I go about achieving it? Note: it doesn't have to be done in C#; it could be in Python or anything else.

EDIT: Some further research brought these two to my attention: http://www.princexml.com/ and https://wkhtmltopdf.org/

From what I understand, both are HTML-to-PDF converters. Would it be possible to use them to take only the HTML of the chart and turn it into a PDF?

Just a first idea. Yes, that graph is not an image.

So, one idea could be: have your software take a screenshot and cut out that specific area with some image editing SDK.

For loading a website and taking a screenshot, I would think of something like Selenium. For editing the image afterwards, you could use something like ImageMagick.
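To make the screenshot-and-crop idea a bit more concrete, here is a rough Python sketch. It assumes Selenium 4 with a local Chrome install, and it uses Pillow for the cropping step instead of ImageMagick. The CSS selector is a parameter because I haven't checked what the chart container on that page is actually called, and the fixed five-second wait is a crude stand-in for a proper wait condition.

```python
from typing import Tuple


def crop_box(x: float, y: float, width: float, height: float,
             scale: float = 1.0) -> Tuple[int, int, int, int]:
    """Turn an element rect in CSS pixels into a (left, top, right, bottom)
    box in screenshot pixels. `scale` is the device pixel ratio."""
    left = int(round(x * scale))
    top = int(round(y * scale))
    right = int(round((x + width) * scale))
    bottom = int(round((y + height) * scale))
    return left, top, right, bottom


def save_chart(url: str, css_selector: str, out_path: str) -> None:
    """Load the page in headless Chrome, screenshot the viewport, crop to one element."""
    # Third-party imports stay local so crop_box is usable without them.
    import io
    import time
    from PIL import Image
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    options.add_argument("--window-size=1400,1200")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        time.sleep(5)  # assumption: enough time for the chart script to draw
        element = driver.find_element(By.CSS_SELECTOR, css_selector)
        driver.execute_script("arguments[0].scrollIntoView(true);", element)
        # Viewport-relative rect, which matches the viewport screenshot below.
        rect = driver.execute_script(
            "var r = arguments[0].getBoundingClientRect();"
            "return {x: r.left, y: r.top, width: r.width, height: r.height};",
            element)
        scale = driver.execute_script("return window.devicePixelRatio;")
        page = Image.open(io.BytesIO(driver.get_screenshot_as_png()))
        box = crop_box(rect["x"], rect["y"], rect["width"], rect["height"], scale)
        page.crop(box).save(out_path)
    finally:
        driver.quit()
```

You would call it as `save_chart("https://fred.stlouisfed.org/series/DFII5", "<selector of the chart container>", "chart.png")`. Selenium can also screenshot a single element directly with `element.screenshot("chart.png")`, which skips the manual crop if you don't need control over the region.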

Another idea could be to grab the underlying data for that chart from the website and draw it yourself.
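A rough Python sketch of the draw-it-yourself route, assuming the series can be downloaded as a two-column CSV (date, value). The URL in the comment is my assumption about FRED's CSV download link, so check the download button on the page for the real one. The code also assumes matplotlib is installed, and it skips rows whose value is not numeric, which is how I'd expect missing observations to show up.

```python
import csv
import io
from datetime import date
from typing import List, Tuple


def parse_series(csv_text: str) -> Tuple[List[date], List[float]]:
    """Parse two-column CSV text (ISO date, value); skip rows without a numeric value."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    dates: List[date] = []
    values: List[float] = []
    for row in reader:
        if len(row) < 2:
            continue
        try:
            value = float(row[1])
        except ValueError:
            continue  # e.g. "." or an empty cell for a missing observation
        dates.append(date.fromisoformat(row[0]))
        values.append(value)
    return dates, values


def save_plot(csv_url: str, out_path: str, title: str = "") -> None:
    """Download the CSV, plot it as a line chart, and write the image to disk."""
    import urllib.request
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no window needed
    import matplotlib.pyplot as plt

    with urllib.request.urlopen(csv_url) as response:
        text = response.read().decode("utf-8")
    dates, values = parse_series(text)
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.plot(dates, values)
    ax.set_title(title)
    fig.savefig(out_path, dpi=150)
    plt.close(fig)


# Assumed download URL, not verified against the site:
# save_plot("https://fred.stlouisfed.org/graph/fredgraph.csv?id=DFII5", "DFII5.png", "DFII5")
```

The result won't look identical to the chart on the site, but you get full control over size and styling, and no browser is needed.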

You could try the HTML2PDF converter. See https://www.html2pdf.fr

Or the HTML2PS converter. See http://user.it.uu.se/%7Ejan/html2ps.html

ImageMagick can use the latter, if it is installed, to do the conversion. See the HTML section at http://www.imagemagick.org/script/formats.php
