I've created a web browser in C# and I want to be able to select part of the web page and have the source for that selection appear in a text box. So far all I've managed to do is get the whole page's source using:
private void btnSource_Click(object sender, EventArgs e)
{
    string PageSource;
    mshtml.HTMLDocument objHtmlDoc = (mshtml.HTMLDocument)webBrowser1.Document.DomDocument;
    PageSource = objHtmlDoc.documentElement.innerHTML;
    rTBSource.Text = PageSource;
}
This is way more information than I need. I'm only looking for one small part of the page at a time.
Using the string.Contains method will be problematic because the text on the web page contains a number of superscripted characters. Normal copying and pasting turns the superscripted characters into regular characters that I cannot get rid of with a regular expression.
If I can work with the source, I would have better luck stripping out the <a> tags and other tags.
Any suggestions?
Compiler: C# 2010 Express; App: WinForms; OS: XP SP3
Try this:
HtmlElementCollection elm = webBrowser1.Document.Body.All;
elm will then hold all the elements in the body of the web page. You can get the HTML of the third element, for example, like this:
elm[2].InnerHtml
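If what you want is the markup of whatever the user has highlighted, rather than an element at a fixed index, one possible approach is to read the selection through mshtml. The following is only a sketch under some assumptions: the project references Microsoft.mshtml, the WebBrowser control is rendering in a mode that still exposes the legacy document.selection object, and the helper name SelectionHelper.GetSelectedHtml is made up for illustration (it is not part of any library).

```csharp
using System.Windows.Forms;
using mshtml; // assumption: the project has a reference to Microsoft.mshtml

static class SelectionHelper // hypothetical helper, name chosen for this example
{
    // Returns the HTML of the current text selection in the browser,
    // or an empty string if there is no document or no text selection.
    public static string GetSelectedHtml(WebBrowser browser)
    {
        if (browser.Document == null)
            return string.Empty;

        IHTMLDocument2 doc = browser.Document.DomDocument as IHTMLDocument2;
        if (doc == null || doc.selection == null)
            return string.Empty;

        // createRange() yields a text range for text selections; for control
        // selections it yields a control range, so the cast below gives null.
        IHTMLTxtRange range = doc.selection.createRange() as IHTMLTxtRange;
        if (range == null || range.htmlText == null)
            return string.Empty;

        return range.htmlText; // markup of the selection, tags included
    }
}
```

In the button handler from the question you could then write rTBSource.Text = SelectionHelper.GetSelectedHtml(webBrowser1); because htmlText keeps the tags, superscripts should show up as markup that you can strip or handle explicitly.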