
How to save a complete webpage using the built-in WebBrowser in C#

Overall, I am trying to write a webpage out to PDF. There is a web service that I can use to convert a file to PDF, so what I am trying to do is save a webpage from the WebBrowser WinForms control.

I have already tried writing out the document stream, but that only gives me the HTML of the page and not the images it uses.

Another approach I looked into, without success, is creating an image of the WebBrowser document. I found some examples on the web that use the DrawToBitmap function, but none of them have worked for me.

Any assistance would be appreciated.

You can use the Graphics.CopyFromScreen function to take screenshots, one visible section at a time, until you have captured the entire page.

// Get the screen location of the web browser
Rectangle rec = webBrowser1.RectangleToScreen(webBrowser1.ClientRectangle);
// Create an image to hold what is in view
Bitmap image = new Bitmap(rec.Width, rec.Height);
// Get a Graphics object to draw on the image, and dispose it when done
using (Graphics g = Graphics.FromImage(image))
{
    // Copy the pixels from the screen into the image.
    // Signature from MSDN:
    // public void CopyFromScreen(
    //     int sourceX,
    //     int sourceY,
    //     int destinationX,
    //     int destinationY,
    //     Size blockRegionSize
    // )
    g.CopyFromScreen(rec.X, rec.Y, 0, 0, rec.Size);
}

You may also want to remove the scrollbars so they aren't in your image:

webBrowser1.ScrollBarsEnabled = false;
webBrowser1.Document.Body.Style = "overflow:hidden;";

And then scroll down to take a shot of the next section of the page:

webBrowser1.Document.Window.ScrollTo(x, y);
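
Putting the pieces above together, here is a rough, untested sketch of how the scroll-and-capture loop might look. The names PageCapture, ComputeScrollOffsets and CaptureFullPage are made up for illustration. It assumes the control is visible on screen and not covered by another window, that Document.Body.ScrollRectangle reports the full page height (this can vary with the page's document mode), and that the page only scrolls vertically.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Windows.Forms;

static class PageCapture
{
    // Hypothetical helper: vertical scroll positions that together cover the page.
    // The last position is clamped, because the browser cannot scroll past
    // (pageHeight - viewportHeight); the final tile then overlaps the previous one.
    public static List<int> ComputeScrollOffsets(int pageHeight, int viewportHeight)
    {
        if (viewportHeight <= 0)
            throw new ArgumentOutOfRangeException(nameof(viewportHeight));

        var offsets = new List<int>();
        int maxScroll = Math.Max(0, pageHeight - viewportHeight);
        for (int y = 0; y < pageHeight; y += viewportHeight)
            offsets.Add(Math.Min(y, maxScroll));
        if (offsets.Count == 0)
            offsets.Add(0); // empty page: still capture what is in view
        return offsets;
    }

    // Hypothetical helper: scrolls through the page and stitches the
    // screenshots into one tall bitmap. The caller disposes the result.
    public static Bitmap CaptureFullPage(WebBrowser browser)
    {
        Rectangle view = browser.RectangleToScreen(browser.ClientRectangle);
        // Assumption: the body's scroll rectangle reflects the full page height.
        int pageHeight = Math.Max(view.Height, browser.Document.Body.ScrollRectangle.Height);

        var result = new Bitmap(view.Width, pageHeight);
        using (Graphics g = Graphics.FromImage(result))
        {
            foreach (int y in ComputeScrollOffsets(pageHeight, view.Height))
            {
                browser.Document.Window.ScrollTo(0, y);
                Application.DoEvents(); // give the control a chance to repaint
                // Draw this tile at the same vertical offset it was scrolled to
                g.CopyFromScreen(view.X, view.Y, 0, y, view.Size);
            }
        }
        return result;
    }
}
```

You would call PageCapture.CaptureFullPage(webBrowser1) after the document has finished loading and then save the returned bitmap with its Save method. Some pages may need a short delay after each scroll before they finish repainting.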

A long time ago I stumbled over this CodeProject article, 'Capture an HTML document as an image'.

However, there is a newer one (posted 13 Feb 2010), 'HTML to Image in C#'.

I haven't tested either of them, but I think they should work!

To create the PDF, the program you're using will need the source code of the site. Whether you use the WebBrowser WinForms control or something else to get that information makes no real difference.

This code will get the source code of any site for you, presuming you don't need to upload anything first:

// Requires: using System.IO; using System.Net;
string url = "some site";
string source = string.Empty;
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
// Dispose the response and the reader once the body has been read
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (StreamReader sr = new StreamReader(response.GetResponseStream()))
{
    source = sr.ReadToEnd();
}
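
Since the downloaded source only references its images, one possible follow-up is to pull the image addresses out of the HTML so they can be fetched separately. The sketch below is an illustration only, not something from the answer above: ImageLinks and FindImageUrls are made-up names, and matching img tags with a regular expression is a simplification that will miss unusual markup.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ImageLinks
{
    // Hypothetical helper: returns the absolute URL of every <img src="..."> found
    // in the HTML, resolving relative paths against the address of the page.
    public static List<Uri> FindImageUrls(string html, Uri pageUrl)
    {
        var urls = new List<Uri>();
        // Simplified pattern: an img tag with a quoted src attribute
        const string pattern = "<img[^>]*?\\ssrc\\s*=\\s*[\"']([^\"']+)[\"']";
        foreach (Match m in Regex.Matches(html, pattern, RegexOptions.IgnoreCase))
        {
            Uri absolute;
            if (Uri.TryCreate(pageUrl, m.Groups[1].Value, out absolute))
                urls.Add(absolute);
        }
        return urls;
    }
}
```

Each returned address could then be downloaded, for example with WebClient.DownloadFile, and saved next to the HTML file.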
