
Using Selenium RemoteWebDriver in C#

I'm trying to use the Selenium RemoteWebDriver in C#. All I want to do is programmatically scrape a webpage's HTML after the JavaScript has finished manipulating the DOM, without a browser window popping up.

First, I started selenium-server.jar like so:

C:\Program Files\selenium-server>java -jar "C:\Program Files\selenium-server\selenium-server.jar"

13:34:46.163 INFO - Java: Sun Microsystems Inc. 19.1-b02
13:34:46.166 INFO - OS: Windows 7 6.1 amd64
13:34:46.174 INFO - v2.0 [a2], with Core v2.0 [a2]
13:34:46.277 INFO - RemoteWebDriver instances should connect to: **http://127.0.0.1:4444/wd/hub**
13:34:46.278 INFO - Version Jetty/5.1.x
13:34:46.279 INFO - Started HttpContext[/selenium-server/driver,/selenium-server/driver]
13:34:46.280 INFO - Started HttpContext[/selenium-server,/selenium-server]
13:34:46.280 INFO - Started HttpContext[/,/]
13:34:46.311 INFO - Started org.openqa.jetty.jetty.servlet.ServletHandler@6019d0a1
13:34:46.312 INFO - Started HttpContext[/wd,/wd]
13:34:46.316 INFO - Started SocketListener on 0.0.0.0:4444
13:34:46.316 INFO - Started org.openqa.jetty.jetty.Server@199a0c7c

Next, I tried to run this line from a test case:

var driver = new RemoteWebDriver(new Uri("http://127.0.0.1:4444/wd/hub"), DesiredCapabilities.Chrome());

This line throws an error:

Test 'Housters.Test.ScrapeTest.TestSelenium' failed: OpenQA.Selenium.WebDriverException : Unexpected error. {"message":"java.lang.NullPointerException","localizedMessage":"java.lang.NullPointerException","cause":{"class":"java.lang.NullPointerException","stackTrace":[{"fileName":"DriverFactory.java","class":"java.lang.StackTraceElement","lineNumber":43,"className":"org.openqa.selenium.remote.server.DriverFactory","methodName":"getBestMatchFor","nativeMethod":false},{"fileName":"DriverFactory.java","class":"java.lang.StackTraceElement","lineNumber":76,"className":"org.openqa.selenium.remote.server.DriverFactory","methodName":"newInstance","nativeMethod":false},{"fileName":"Session.java","class":"java.lang.StackTraceElement","lineNumber":48,"className":"org.openqa.selenium.remote.server.Session$1","methodName":"call","nativeMethod":false},{"fileName":"Session.java","class":"java.lang.StackTraceElement","lineNumber":46,"className":"org.openqa.selenium.remote.server.Session$1","methodName":"call","nativeMethod":false},{"class":"java.lang.StackTraceElement","lineNumber":-1,"className":"java.util.concurrent.FutureTask$Sync","methodName":"innerRun","nativeMethod":false},{"class":"java.lang.StackTraceElement","lineNumber":-1,"className":"java.util.concurrent.FutureTask","methodName":"run","nativeMethod":false},{"class":"java.lang.StackTraceElement","lineNumber":-1,"className":"java.util.concurrent.ThreadPoolExecutor$Worker","methodName":"runTask","nativeMethod":false},{"class":"java.lang.StackTraceElement","lineNumber":-1,"className":"java.util.concurrent.ThreadPoolExecutor$Worker","methodName":"run","nativeMethod":false},{"class":"java.lang.StackTraceElement","lineNumber":-1,"className":"java.lang.Thread","methodName":"run","nativeMethod":false}]},"class":"java.util.concurrent.ExecutionException","stackTrace":[{"class":"java.lang.StackTraceElement","lineNumber":-1,"className":"java.util.concurrent.FutureTask$Sync","methodName":"innerGet","nativeMethod":false},{"class":"jav
a.lang.StackTraceElement","lineNumber":-1,"className":"java.util.concurrent.FutureTask","methodName":"get","nativeMethod":false},{"fileName":"Session.java","class":"java.lang.StackTraceElement","lineNumber":68,"className":"org.openqa.selenium.remote.server.Session","methodName":"execute","nativeMethod":false},{"fileName":"Session.java","class":"java.lang.StackTraceElement","lineNumber":54,"className":"org.openqa.selenium.remote.server.Session","methodName":"<init>","nativeMethod":false},{"fileName":"DriverSessions.java","class":"java.lang.StackTraceElement","lineNumber":76,"className":"org.openqa.selenium.remote.server.DriverSessions","methodName":"newSession","nativeMethod":false},{"fileName":"NewSession.java","class":"java.lang.StackTraceElement","lineNumber":46,"className":"org.openqa.selenium.remote.server.handler.NewSession","methodName":"handle","nativeMethod":false},{"fileName":"ResultConfig.java","class":"java.lang.StackTraceElement","lineNumber":144,"className":"org.openqa.selenium.remote.server.rest.ResultConfig","methodName":"handle","nativeMethod":false},{"fileName":"DriverServlet.java","class":"java.lang.StackTraceElement","lineNumber":271,"className":"org.openqa.selenium.remote.server.DriverServlet","methodName":"handleRequest","nativeMethod":false},{"fileName":"DriverServlet.java","class":"java.lang.StackTraceElement","lineNumber":256,"className":"org.openqa.selenium.remote.server.DriverServlet","methodName":"doPost","nativeMethod":false},{"fileName":"HttpServlet.java","class":"java.lang.StackTraceElement","lineNumber":727,"className":"javax.servlet.http.HttpServlet","methodName":"service","nativeMethod":false},{"fileName":"HttpServlet.java","class":"java.lang.StackTraceElement","lineNumber":820,"className":"javax.servlet.http.HttpServlet","methodName":"service","nativeMethod":false},{"fileName":"ServletHolder.java","class":"java.lang.StackTraceElement","lineNumber":428,"className":"org.openqa.jetty.jetty.servlet.ServletHolder","methodName":"handle","
nativeMethod":false},{"fileName":"ServletHandler.java","class":"java.lang.StackTraceElement","lineNumber":677,"className":"org.openqa.jetty.jetty.servlet.ServletHandler","methodName":"dispatch","nativeMethod":false},{"fileName":"ServletHandler.java","class":"java.lang.StackTraceElement","lineNumber":568,"className":"org.openqa.jetty.jetty.servlet.ServletHandler","methodName":"handle","nativeMethod":false},{"fileName":"HttpContext.java","class":"java.lang.StackTraceElement","lineNumber":1530,"className":"org.openqa.jetty.http.HttpContext","methodName":"handle","nativeMethod":false},{"fileName":"HttpContext.java","class":"java.lang.StackTraceElement","lineNumber":1482,"className":"org.openqa.jetty.http.HttpContext","methodName":"handle","nativeMethod":false},{"fileName":"HttpServer.java","class":"java.lang.StackTraceElement","lineNumber":909,"className":"org.openqa.jetty.http.HttpServer","methodName":"service","nativeMethod":false},{"fileName":"HttpConnection.java","class":"java.lang.StackTraceElement","lineNumber":820,"className":"org.openqa.jetty.http.HttpConnection","methodName":"service","nativeMethod":false},{"fileName":"HttpConnection.java","class":"java.lang.StackTraceElement","lineNumber":986,"className":"org.openqa.jetty.http.HttpConnection","methodName":"handleNext","nativeMethod":false},{"fileName":"HttpConnection.java","class":"java.lang.StackTraceElement","lineNumber":837,"className":"org.openqa.jetty.http.HttpConnection","methodName":"handle","nativeMethod":false},{"fileName":"SocketListener.java","class":"java.lang.StackTraceElement","lineNumber":245,"className":"org.openqa.jetty.http.SocketListener","methodName":"handleConnection","nativeMethod":false},{"fileName":"ThreadedServer.java","class":"java.lang.StackTraceElement","lineNumber":357,"className":"org.openqa.jetty.util.ThreadedServer","methodName":"handle","nativeMethod":false},{"fileName":"ThreadPool.java","class":"java.lang.StackTraceElement","lineNumber":534,"className":"org.openqa.jetty.util.T
hreadPool$PoolThread","methodName":"run","nativeMethod":false}]}
    at OpenQA.Selenium.Remote.RemoteWebDriver.UnpackAndThrowOnError(Response errorResponse)
    at OpenQA.Selenium.Remote.RemoteWebDriver.Execute(DriverCommand driverCommandToExecute, Dictionary`2 parameters)
    at OpenQA.Selenium.Remote.RemoteWebDriver.StartSession(ICapabilities desiredCapabilities)
    at OpenQA.Selenium.Remote.RemoteWebDriver..ctor(Uri remoteAddress, ICapabilities desiredCapabilities)
    ScrapeTest.cs(36,0): at Housters.Test.ScrapeTest.TestSelenium()

The server console window shows this error:

13:44:55.558 INFO - WebDriver remote server: Executing: [new session: null] at URL: /session)
13:44:55.560 INFO - WebDriver remote server: Exception: java.lang.NullPointerException

I'm trying to do this on Windows 7 x64. What am I doing wrong? This seems like a lot of work for what I want to do...

I see that you already have an accepted answer, but because it answers a completely different question, I will add my 5 cents.

So if you want to connect to a remote web driver, instead of the line:

var driver = new FirefoxDriver();

You have to write:

var driver = new RemoteWebDriver(new Uri("http://localhost:4444/wd/hub"), DesiredCapabilities.Firefox()); // instead of this url you can put the url of your remote hub

Do not forget to import the Remote namespace at the top of the file:

using OpenQA.Selenium.Remote;
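Putting those pieces together, a minimal sketch of the whole round trip might look like the following. It assumes a Selenium server is already listening at the hub URL and that a newer version of the .NET bindings is installed, where `FirefoxOptions` takes the place of `DesiredCapabilities.Firefox()`. The class name, the `BuildHubUri` helper, and the 10-second timeout are illustrative choices, not part of the original answer. Waiting for `document.readyState` is only a rough signal that the page has loaded; a page that keeps modifying the DOM afterwards would need a more specific wait condition.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;
using OpenQA.Selenium.Remote;
using OpenQA.Selenium.Support.UI;

public static class RemoteScrapeSketch
{
    // Hypothetical helper: builds the hub URL that the server prints at startup.
    public static Uri BuildHubUri(string host, int port)
    {
        return new Uri($"http://{host}:{port}/wd/hub");
    }

    // Opens a remote session, loads the page, and returns the rendered HTML.
    public static string FetchRenderedHtml(Uri hubUri, string pageUrl)
    {
        // Assumption: the bindings accept a DriverOptions instance here.
        IWebDriver driver = new RemoteWebDriver(hubUri, new FirefoxOptions());
        try
        {
            driver.Navigate().GoToUrl(pageUrl);

            // Poll until the browser reports the document as fully loaded.
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
            wait.Until(d => "complete".Equals(
                ((IJavaScriptExecutor)d).ExecuteScript("return document.readyState")));

            // PageSource reflects the DOM after scripts have run so far.
            return driver.PageSource;
        }
        finally
        {
            // Always end the remote session so the server can release the browser.
            driver.Quit();
        }
    }
}
```

A call such as `RemoteScrapeSketch.FetchRenderedHtml(RemoteScrapeSketch.BuildHubUri("localhost", 4444), "http://example.com/")` would then return the page's HTML as a string, provided the server can start the requested browser.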

If you are attempting to run Selenium 2 on localhost, you don't need RemoteWebDriver or Selenium Server. You can use the following:

IWebDriver driver = new ChromeDriver();

I have found fewer issues running locally this way than when using RemoteWebDriver, and you should find that more diagnostic information is readily available if there is a problem.

Alternatively, you can use HtmlUnit directly, as described at http://blog.stevensanderson.com/2010/03/30/using-htmlunit-on-net-for-headless-browser-automation/

Getting the HTML of the page you're interested in is easy, particularly in .NET (use HttpWebRequest). Parsing it into a DOM-like structure and applying the transforms indicated by the JavaScript on the page is much harder. It requires a browser, or at the very least an HTML parser with a DOM construction engine, as well as a JavaScript engine to manipulate the resulting DOM. It's not a trivial problem. At this point, your choices are: (1) use HtmlUnit, env.js, or one of the other "headless" browser projects, none of which require WebDriver to do what you want to do; or (2) live with a browser window popping up. In the case of (2), using the raw InternetExplorerDriver or FirefoxDriver (or ChromeDriver in the upcoming 2.0b4) is a simpler choice than using the remote server.
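To illustrate the "easy" half of that statement, here is a minimal sketch of fetching a page's raw HTML with HttpWebRequest. The class and method names are hypothetical. The string it returns is the markup as the server sent it, before any JavaScript has run, which is exactly why this alone does not solve the problem in the question.

```csharp
using System.IO;
using System.Net;

public static class RawHtmlSketch
{
    // Downloads the response body for the given URL as a string.
    // No scripts are executed, so DOM changes made by JavaScript are absent.
    public static string DownloadRawHtml(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

On recent .NET versions HttpWebRequest is marked obsolete in favor of HttpClient; it is used here only because it is the API this answer names.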

Since Chrome 59 became available, you can run the browser in headless mode. See this thread.
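As a rough sketch of what that looks like from C#, the snippet below passes Chrome's `--headless` switch through `ChromeOptions` and reads the rendered page source. It assumes Chrome 59 or later and a matching chromedriver are installed and discoverable; the class and method names are illustrative only.

```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

public static class HeadlessChromeSketch
{
    // Builds options that ask Chrome to start without a visible window.
    public static ChromeOptions BuildHeadlessOptions()
    {
        var options = new ChromeOptions();
        options.AddArgument("--headless");
        return options;
    }

    // Loads the page in headless Chrome and returns the rendered HTML.
    public static string FetchRenderedHtml(string pageUrl)
    {
        // Disposing the driver quits the browser and ends the session.
        using (IWebDriver driver = new ChromeDriver(BuildHeadlessOptions()))
        {
            driver.Navigate().GoToUrl(pageUrl);
            return driver.PageSource;
        }
    }
}
```

This runs the browser locally, so no Selenium server is needed, and no window appears on screen.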
