
C# headless browser with JavaScript support for crawler

Can anyone suggest a headless browser for .NET that supports cookies and proper JavaScript execution?

Selenium + HtmlUnitDriver/GhostDriver is exactly what you are looking for. Oversimplified, Selenium is a library for driving a variety of browsers for automation purposes: testing, scraping, task automation.

There are different WebDriver classes with which you can operate an actual browser. HtmlUnitDriver is a headless one. GhostDriver is a WebDriver for PhantomJS, so you can write C# while PhantomJS does the heavy lifting.

The code snippet below is from the Selenium docs and uses Firefox, but the code with GhostDriver (PhantomJS) or HtmlUnitDriver is almost identical.

using System; // needed for TimeSpan
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;
using OpenQA.Selenium.Support.UI;

class GoogleSuggest
{
    static void Main(string[] args)
    {
        // Driver initialization varies between drivers, but the browser-specific
        // ones (FirefoxDriver, ChromeDriver, PhantomJSDriver) offer parameterless constructors.
        IWebDriver driver = new FirefoxDriver();
        driver.Navigate().GoToUrl("http://www.google.com/");


        IWebElement query = driver.FindElement(By.Name("q"));
        query.SendKeys("Cheese");
        query.Submit();

        WebDriverWait wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
        wait.Until((d) => { return d.Title.ToLower().StartsWith("cheese"); });

        System.Console.WriteLine("Page title is: " + driver.Title);

        driver.Quit();
    }
}

If you run this on a Windows machine you can use the actual Firefox/Chrome driver, which opens a real browser window that behaves as programmed in your C#. HtmlUnitDriver is the most lightweight and the fastest.
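For the headless case, swapping in the PhantomJS driver is mostly a one-line change. The following is a minimal sketch, not a definitive setup. It assumes an older Selenium.WebDriver release that still ships the OpenQA.Selenium.PhantomJS namespace (later releases dropped it) and a phantomjs executable that the driver can find on the PATH. The class name and the example.com URL are placeholders.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.PhantomJS;

class HeadlessTitle
{
    static void Main(string[] args)
    {
        // Assumption: phantomjs(.exe) is on the PATH or next to this program.
        IWebDriver driver = new PhantomJSDriver();
        try
        {
            // No browser window opens; PhantomJS renders the page in memory.
            driver.Navigate().GoToUrl("http://example.com/");
            Console.WriteLine("Page title is: " + driver.Title);
        }
        finally
        {
            // Always shut down the PhantomJS process, even if navigation fails.
            driver.Quit();
        }
    }
}
```

Everything after the constructor call uses only the IWebDriver interface, which is why the rest of the Firefox example carries over unchanged.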

I have successfully run Selenium for C# (FirefoxDriver) on Linux using Mono. I suppose HtmlUnitDriver will work as well as the others, so if you require speed, I suggest you go for Mono (you can develop, test and compile with Visual Studio on Windows, no problem) + Selenium HtmlUnitDriver running on a Linux host without a desktop.
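One caveat worth sketching: HtmlUnit is a Java library, so from C# it is typically reached through a Selenium standalone server rather than instantiated directly. The sketch below assumes such a server is already running at http://localhost:4444/wd/hub and that the Selenium.WebDriver release in use still provides DesiredCapabilities.HtmlUnitWithJavaScript() (older 2.x/3.x bindings). The class name, hub address and target URL are illustrative.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Remote;

class HtmlUnitRemote
{
    static void Main(string[] args)
    {
        // Assumption: a Selenium standalone server (Java) is listening at this address.
        Uri hub = new Uri("http://localhost:4444/wd/hub");

        // Ask the server for an HtmlUnit session with JavaScript enabled.
        IWebDriver driver = new RemoteWebDriver(hub, DesiredCapabilities.HtmlUnitWithJavaScript());
        try
        {
            driver.Navigate().GoToUrl("http://example.com/");
            Console.WriteLine("Page title is: " + driver.Title);
        }
        finally
        {
            // Ends the remote session on the server.
            driver.Quit();
        }
    }
}
```

Since the C# side only talks HTTP to the server, this should run under Mono on a Linux host without a desktop just as well as on Windows.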

I am not aware of a .NET-based headless browser, but there is always PhantomJS, which is written in C/C++ and works fairly well for assisting in unit testing of JS with QUnit.

There is also another relevant question here which might help you: Headless browser for C# (.NET)?


 