How to “manually” go back with a WebBrowser?

I'm working on a web scraper that sometimes needs to remember a particular page, visit some other pages, and then go back to that page. Currently I just save the URL of the page, but that doesn't work for pages like Google Maps, where the URL is always the same.

I can see that the GoBack method does go back to the previous page, so somehow the WebBrowser remembers what the previous pages were. How can I do this manually? I could count how many pages have been visited since the page I want to go back to and then call GoBack as many times as necessary, but that's pretty unreliable and inelegant. So I wonder how I could implement a GoBackToAParticularPage method.

There is one thing I think would get me closer to a solution: saving the URLs of all frames and then putting them back when going back to that page. I think that would solve at least the Google Maps problem. I have not tested it yet, and I don't know exactly what the proper way to do this would be. I would need to wait for the frames to exist before setting their URLs.

You can use

webBrowser1.Document.Window.History.Go(x);

where x is an int signifying the relative position in the browser's history.

x = -2 would navigate two pages back.

Update: More info on HtmlHistory.Go()
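As a minimal sketch of how that call might be guarded in practice: Document is null until a page has loaded, so the chain is checked step by step. The class and method names (HistoryHelper, TryGoRelative) are hypothetical; only HtmlDocument.Window, HtmlWindow.History and HtmlHistory.Go come from Windows Forms.

```csharp
using System.Windows.Forms;

static class HistoryHelper
{
    // Hypothetical helper: moves `offset` entries through the session history
    // (negative = back). Returns false when no document is loaded yet.
    public static bool TryGoRelative(WebBrowser browser, int offset)
    {
        HtmlDocument doc = browser.Document;   // null before the first page loads
        if (doc == null || doc.Window == null) return false;

        HtmlHistory history = doc.Window.History;
        if (history == null) return false;

        history.Go(offset);                    // e.g. -2 goes two pages back
        return true;
    }
}
```

A call such as HistoryHelper.TryGoRelative(webBrowser1, -2) would then go two pages back, provided the offset is tracked correctly, which is the part the question calls unreliable.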

Try this!

javascript:history.go(-1)

I know a few things have been said already, so I won't repeat them. However, if you really want to use a JavaScript method (i.e. the JavaScript history object instead of the WebBrowser control's history object) and are wondering how, there are ways to do this. You can use .InvokeScript in the .NET WB control, or, if you want something compatible with both pre-.NET and .NET versions, you can use this:

You can use .execScript in both pre-.NET and current/.NET versions of the WB control. You can also choose the language of the script you want to execute, i.e. "JScript" or "VBScript". Here is the one-liner:

WebBrowser1.Document.parentWindow.execScript "alert('hello world');", "JScript" 
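For the .InvokeScript route mentioned above, a rough C# sketch might look like the following. The names ScriptHistory and GoBackViaScript are hypothetical, and calling "eval" through InvokeScript is an assumption about the hosted page's script engine rather than something guaranteed for every page.

```csharp
using System.Windows.Forms;

static class ScriptHistory
{
    // Hypothetical helper: asks the page's own script engine to move back
    // `steps` entries using the JavaScript history object.
    public static void GoBackViaScript(WebBrowser browser, int steps)
    {
        if (browser.Document == null) return;  // nothing loaded yet

        // InvokeScript calls a global script function by name; "eval" is used
        // here on the assumption that the hosted page exposes it.
        browser.Document.InvokeScript("eval",
            new object[] { "history.go(-" + steps + ")" });
    }
}
```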

The good thing about using the JavaScript history object is this: if you suppress history information in the WebBrowser control by passing the number 2 (the navNoHistory flag) as the flags argument of the .Navigate method, going back to the page where history was suppressed will not work through the WB control, but it will still work through the JavaScript history object. This is an advantage.

Once again, this is just a backwards-compatible supplement to the ideas already discussed in this post, with a few other tidbits not mentioned before.

Let me know if I can be of further help, even though an answer has already been accepted.

You can achieve your task with the JavaScript history object.

<FORM><INPUT TYPE="BUTTON" VALUE="Go Back" 
ONCLICK="history.go(-1)"></FORM>

Also check the JavaScript History Object reference for more information on history.

Browser history is opaque by design; otherwise it would open a security hole. Do you really want every page you visit to have visibility into what pages and sites you've been visiting? Probably not.

To do what you want, you'll need to implement your own stack of URIs, tracking what needs to be revisited.
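One possible shape for such a stack is sketched below. The class name RevisitStack and its members are hypothetical, and it only tracks URIs, so it inherits the limitation from the question that pages sharing one URL cannot be told apart.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: remembers pages to revisit, most recent first.
public class RevisitStack
{
    private readonly Stack<Uri> _pages = new Stack<Uri>();

    public int Count { get { return _pages.Count; } }

    // Call when you reach a page you will want to come back to.
    public void Remember(Uri page)
    {
        if (page == null) throw new ArgumentNullException("page");
        _pages.Push(page);
    }

    // Returns the most recently remembered page, or null if none is left.
    public Uri PopOrNull()
    {
        return _pages.Count > 0 ? _pages.Pop() : null;
    }
}
```

The scraper would call Remember(webBrowser1.Url) before leaving a page and later pass a non-null result of PopOrNull() to webBrowser1.Navigate.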

You don't want to use history.go(-1) because it is unreliable. But you can't use the URL either, because there are pages like Google Maps where the URL is always the same.

If the URL is the same but the content is different, then the values that determine the page's content are being pulled from somewhere other than the URL.

Where could this be?

Your most likely suspect is the posted form collection, but the data could also be coming from a cookie.

I think it makes a lot more sense to index the absolute location than a relative location because, as you noted, relative locations can be unreliable. The problem is that you need to get all the data that is being sent to the web server in order to understand what the actual absolute location is (because the URI alone is not sufficient).

The way to do this is to create a local copy of the page and replace the submission URL (which could be in a link, a form, or the JavaScript) with a URL on your own server. Then, when you click something on the Google Maps page that triggers a change (one that seems not to affect the URL), you will receive that data on your server and will be able to determine the actual location.

Think about it like a querystring.

If I have

<form action="http://myhost.com/page.html" method="get">
   <input type="hidden" name="secret_location_parameter" value="mrbigglesworth" />
   <input type="submit" />
</form>

and I click the submit button, I get taken to the URL

 http://myhost.com/page.html?secret_location_parameter=mrbigglesworth

However, if I have

<form action="http://myhost.com/page.html" method="post">
   <input type="hidden" name="secret_location_parameter" value="mrbigglesworth" />
   <input type="submit" />
</form>

and I click the submit button, then I get taken to the URL

 http://myhost.com/page.html

The server still receives secret_location_parameter=mrbigglesworth, but it gets it as a form value instead of a querystring value, so it isn't visible in the URL. The server might render a different page depending on the secret_location_parameter value without changing the URL, so if the post method is used, it will appear that multiple pages reside at the same URL.
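If the posted values are known, one way to return to such a page might be to replay the POST through the WebBrowser.Navigate overload that accepts post data. This is only a sketch: PostReplay, EncodeForm and Replay are hypothetical names, and capturing the form values in the first place is the hard part and is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Windows.Forms;

static class PostReplay
{
    // Percent-encodes name/value pairs into an
    // application/x-www-form-urlencoded request body.
    public static string EncodeForm(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var parts = new List<string>();
        foreach (var f in fields)
            parts.Add(Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value));
        return string.Join("&", parts.ToArray());
    }

    // Re-sends the saved form body to the same URL as a POST request.
    public static void Replay(WebBrowser browser, string url, string formBody)
    {
        byte[] postData = Encoding.UTF8.GetBytes(formBody);
        browser.Navigate(url, "", postData,
            "Content-Type: application/x-www-form-urlencoded\r\n");
    }
}
```

For the form above, EncodeForm would produce secret_location_parameter=mrbigglesworth, which Replay would then post to http://myhost.com/page.html.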

My point is that you may be addressing the problem from the wrong angle, because you didn't understand what was going on under the hood. I am certainly making assumptions, but based on the way you asked your question I think this may be helpful for you.

If you don't need to see what is happening visually, there may be more elegant ways to navigate and parse URLs using the WebClient class; perhaps elaborating on your specific program would yield clearer results.

Assuming that you have a WebBrowser control on a form and you are trying to implement going back, the following is the solution. (If the assumption is wrong, please correct me.)

Add a WebBrowser, a TextBox, and a Button named btnBack.

The History variable also holds the URL data for navigation (but is not used currently).

C# solution

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;

namespace WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        // URLs seen so far, most recent on top (recorded but not yet used for navigation).
        Stack<String> History = new Stack<String>();

        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            WebBrowser1.Url = new Uri("http://maps.google.com");
        }

        // Show each URL being navigated to and record it.
        private void WebBrowser1_Navigating(object sender, WebBrowserNavigatingEventArgs e)
        {
            TextBox1.Text = e.Url.ToString();
            History.Push(e.Url.ToString());
        }

        // Go back one entry using the control's own history.
        private void btnBack_Click(object sender, EventArgs e)
        {
            if (WebBrowser1.CanGoBack)
            {
                WebBrowser1.GoBack();
            }
        }
    }
}

VB solution

Public Class Form1
Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
    WebBrowser1.Url = New Uri("http://maps.google.com")
End Sub

Private Sub WebBrowser1_Navigating(ByVal sender As Object, ByVal e As System.Windows.Forms.WebBrowserNavigatingEventArgs) Handles WebBrowser1.Navigating
    TextBox1.Text = e.Url.ToString
    History.Push(e.Url.ToString)
End Sub
Dim History As New Stack(Of String)
Private Sub btnBack_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnBack.Click
    If WebBrowser1.CanGoBack Then
        WebBrowser1.GoBack()
    End If
End Sub

End Class

Programmatically add a marker element to the DOM of those pages you will later want to go back to. When backtracking through the browser history, check for that marker after each history.go(-1) and stop when you encounter it. This might prove unreliable in some cases, in which case remembering the depth level may serve as a backup approach.

You may need to experiment with the right time to insert the element, to make sure it is properly recorded in the history.
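The marker idea could be sketched in C# roughly as follows. The names PageMarker, Mark, IsMarked and the marker id are hypothetical, and whether an injected element is still present after navigating back through history is exactly the assumption that needs experimenting with.

```csharp
using System.Windows.Forms;

static class PageMarker
{
    // Hypothetical id; anything unlikely to clash with the page's own ids.
    const string MarkerId = "scraper-return-marker";

    // Adds a hidden element to the current page so it can be recognised later.
    public static void Mark(WebBrowser browser)
    {
        HtmlDocument doc = browser.Document;
        if (doc == null || doc.Body == null) return;

        HtmlElement marker = doc.CreateElement("div");
        marker.Id = MarkerId;
        marker.Style = "display:none";
        doc.Body.AppendChild(marker);
    }

    // True when the currently loaded page carries the marker.
    public static bool IsMarked(WebBrowser browser)
    {
        HtmlDocument doc = browser.Document;
        return doc != null && doc.GetElementById(MarkerId) != null;
    }
}
```

After each step back, the scraper would wait for DocumentCompleted and then call IsMarked to decide whether to stop.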

In case anyone else can benefit from it, here is how I ended up doing it. The only caveat is that if there are too many pages in the travel log in between, the entry might not exist any more. There is probably a way to increase the history size, but since there has to be some limit, I use the TravelLog.GetTravelLogEntries method to see whether the entry still exists and, if not, use the URL instead.

Most of this code came from PInvoke.

using System;
using System.Runtime.InteropServices;
using System.Windows.Forms;
using System.Collections.Generic;

namespace TravelLogUtils
{
    [ComVisible(true), ComImport()]
    [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    [GuidAttribute("7EBFDD87-AD18-11d3-A4C5-00C04F72D6B8")]
    public interface ITravelLogEntry
    {
        [return: MarshalAs(UnmanagedType.I4)]
        [PreserveSig]
        int GetTitle([Out] out IntPtr ppszTitle); //LPOLESTR LPWSTR

        [return: MarshalAs(UnmanagedType.I4)]
        [PreserveSig]
        int GetURL([Out] out IntPtr ppszURL); //LPOLESTR LPWSTR
    }

    [ComVisible(true), ComImport()]
    [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    [GuidAttribute("7EBFDD85-AD18-11d3-A4C5-00C04F72D6B8")]
    public interface IEnumTravelLogEntry
    {
        [return: MarshalAs(UnmanagedType.I4)]
        [PreserveSig]
        int Next(
            [In, MarshalAs(UnmanagedType.U4)] int celt,
            [Out] out ITravelLogEntry rgelt,
            [Out, MarshalAs(UnmanagedType.U4)] out int pceltFetched);

        [return: MarshalAs(UnmanagedType.I4)]
        [PreserveSig]
        int Skip([In, MarshalAs(UnmanagedType.U4)] int celt);

        void Reset();

        void Clone([Out] out ITravelLogEntry ppenum);
    }

    public enum TLMENUF
    {
        /// <summary>
        /// Enumeration should include the current travel log entry.
        /// </summary>
        TLEF_RELATIVE_INCLUDE_CURRENT = 0x00000001,
        /// <summary>
        /// Enumeration should include entries before the current entry.
        /// </summary>
        TLEF_RELATIVE_BACK = 0x00000010,
        /// <summary>
        /// Enumeration should include entries after the current entry.
        /// </summary>
        TLEF_RELATIVE_FORE = 0x00000020,
        /// <summary>
        /// Enumeration should include entries which cannot be navigated to.
        /// </summary>
        TLEF_INCLUDE_UNINVOKEABLE = 0x00000040,
        /// <summary>
        /// Enumeration should include all invokable entries.
        /// </summary>
        TLEF_ABSOLUTE = 0x00000031
    }

    [ComVisible(true), ComImport()]
    [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    [GuidAttribute("7EBFDD80-AD18-11d3-A4C5-00C04F72D6B8")]
    public interface ITravelLogStg
    {
        [return: MarshalAs(UnmanagedType.I4)]
        [PreserveSig]
        int CreateEntry([In, MarshalAs(UnmanagedType.LPWStr)] string pszUrl,
            [In, MarshalAs(UnmanagedType.LPWStr)] string pszTitle,
            [In] ITravelLogEntry ptleRelativeTo,
            [In, MarshalAs(UnmanagedType.Bool)] bool fPrepend,
            [Out] out ITravelLogEntry pptle);

        [return: MarshalAs(UnmanagedType.I4)]
        [PreserveSig]
        int TravelTo([In] ITravelLogEntry ptle);

        [return: MarshalAs(UnmanagedType.I4)]
        [PreserveSig]
        int EnumEntries([In] int TLENUMF_flags, [Out] out IEnumTravelLogEntry ppenum);

        [return: MarshalAs(UnmanagedType.I4)]
        [PreserveSig]
        int FindEntries([In] int TLENUMF_flags,
        [In, MarshalAs(UnmanagedType.LPWStr)] string pszUrl,
        [Out] out IEnumTravelLogEntry ppenum);

        [return: MarshalAs(UnmanagedType.I4)]
        [PreserveSig]
        int GetCount([In] int TLENUMF_flags, [Out] out int pcEntries);

        [return: MarshalAs(UnmanagedType.I4)]
        [PreserveSig]
        int RemoveEntry([In] ITravelLogEntry ptle);

        [return: MarshalAs(UnmanagedType.I4)]
        [PreserveSig]
        int GetRelativeEntry([In] int iOffset, [Out] out ITravelLogEntry ptle);
    }

    [ComImport, ComVisible(true)]
    [Guid("6d5140c1-7436-11ce-8034-00aa006009fa")]
    [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    public interface IServiceProvider
    {
        [return: MarshalAs(UnmanagedType.I4)]
        [PreserveSig]
        int QueryService(
            [In] ref Guid guidService,
            [In] ref Guid riid,
            [Out] out IntPtr ppvObject);
    }

    public class TravelLog
    {
        public static Guid IID_ITravelLogStg = new Guid("7EBFDD80-AD18-11d3-A4C5-00C04F72D6B8");
        public static Guid SID_STravelLogCursor = new Guid("7EBFDD80-AD18-11d3-A4C5-00C04F72D6B8");

        //public static void TravelTo(WebBrowser webBrowser, int 
        public static ITravelLogEntry GetTravelLogEntry(WebBrowser webBrowser)
        {
            int HRESULT_OK = 0;

            SHDocVw.IWebBrowser2 axWebBrowser = (SHDocVw.IWebBrowser2)webBrowser.ActiveXInstance;
            IServiceProvider psp = axWebBrowser as IServiceProvider;
            if (psp == null) throw new Exception("Could not get IServiceProvider.");

            IntPtr oret = IntPtr.Zero;            
            int hr = psp.QueryService(ref SID_STravelLogCursor, ref IID_ITravelLogStg, out oret);            
            if ((oret == IntPtr.Zero) || (hr != HRESULT_OK)) throw new Exception("Failed to query service.");

            ITravelLogStg tlstg = Marshal.GetObjectForIUnknown(oret) as ITravelLogStg;
            if (null == tlstg) throw new Exception("Failed to get ITravelLogStg");            
            ITravelLogEntry ptle = null;

            hr = tlstg.GetRelativeEntry(0, out ptle);

            if (hr != HRESULT_OK) throw new Exception("Failed to get travel log entry with error " + hr.ToString("X"));

            Marshal.ReleaseComObject(tlstg);
            return ptle;
        }

        public static void TravelToTravelLogEntry(WebBrowser webBrowser, ITravelLogEntry travelLogEntry)
        {
            int HRESULT_OK = 0;

            SHDocVw.IWebBrowser2 axWebBrowser = (SHDocVw.IWebBrowser2)webBrowser.ActiveXInstance;
            IServiceProvider psp = axWebBrowser as IServiceProvider;
            if (psp == null) throw new Exception("Could not get IServiceProvider.");

            IntPtr oret = IntPtr.Zero;
            int hr = psp.QueryService(ref SID_STravelLogCursor, ref IID_ITravelLogStg, out oret);
            if ((oret == IntPtr.Zero) || (hr != HRESULT_OK)) throw new Exception("Failed to query service.");

            ITravelLogStg tlstg = Marshal.GetObjectForIUnknown(oret) as ITravelLogStg;
            if (null == tlstg) throw new Exception("Failed to get ITravelLogStg");

            hr = tlstg.TravelTo(travelLogEntry);

            if (hr != HRESULT_OK) throw new Exception("Failed to travel to log entry with error " + hr.ToString("X"));

            Marshal.ReleaseComObject(tlstg);
        }

        public static HashSet<ITravelLogEntry> GetTravelLogEntries(WebBrowser webBrowser)
        {
            int HRESULT_OK = 0;

            SHDocVw.IWebBrowser2 axWebBrowser = (SHDocVw.IWebBrowser2)webBrowser.ActiveXInstance;
            IServiceProvider psp = axWebBrowser as IServiceProvider;
            if (psp == null) throw new Exception("Could not get IServiceProvider.");

            IntPtr oret = IntPtr.Zero;
            int hr = psp.QueryService(ref SID_STravelLogCursor, ref IID_ITravelLogStg, out oret);
            if ((oret == IntPtr.Zero) || (hr != HRESULT_OK)) throw new Exception("Failed to query service.");

            ITravelLogStg tlstg = Marshal.GetObjectForIUnknown(oret) as ITravelLogStg;
            if (null == tlstg) throw new Exception("Failed to get ITravelLogStg");

            //Enum the travel log entries
            IEnumTravelLogEntry penumtle = null;
            tlstg.EnumEntries((int)TLMENUF.TLEF_ABSOLUTE, out penumtle);
            hr = 0;
            ITravelLogEntry ptle = null;
            int fetched = 0;
            const int MAX_FETCH_COUNT = 1;

            hr = penumtle.Next(MAX_FETCH_COUNT, out ptle, out fetched);
            Marshal.ThrowExceptionForHR(hr);

            HashSet<ITravelLogEntry> results = new HashSet<ITravelLogEntry>();

            for (int i = 0; 0 == hr; i++)
            {
                if (ptle != null) results.Add(ptle);
                hr = penumtle.Next(MAX_FETCH_COUNT, out ptle, out fetched);
                Marshal.ThrowExceptionForHR(hr);
            }

            Marshal.ReleaseComObject(penumtle);
            Marshal.ReleaseComObject(tlstg);

            return results;
        }
    }
}
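A possible way to use the TravelLog class for the bookmark-and-return scenario is sketched below. PageBookmark is a hypothetical wrapper, and the Contains check assumes that the runtime hands back the same wrapper object for the same COM travel log entry, which may not hold in every case.

```csharp
using System;
using System.Windows.Forms;
using TravelLogUtils;

// Hypothetical wrapper around the TravelLog class above.
class PageBookmark
{
    private readonly ITravelLogEntry _entry;
    private readonly Uri _fallbackUrl;

    // Captures the current travel log entry, plus the URL as a fallback.
    public PageBookmark(WebBrowser browser)
    {
        _entry = TravelLog.GetTravelLogEntry(browser);
        _fallbackUrl = browser.Url;
    }

    // Travels to the saved entry if it is still in the travel log;
    // otherwise falls back to plain URL navigation.
    public void GoBackTo(WebBrowser browser)
    {
        if (TravelLog.GetTravelLogEntries(browser).Contains(_entry))
            TravelLog.TravelToTravelLogEntry(browser, _entry);
        else
            browser.Navigate(_fallbackUrl);
    }
}
```

Creating a PageBookmark on the page to remember and calling GoBackTo later gives roughly the GoBackToAParticularPage behaviour asked for in the question.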
