
Getting DOCTYPE using Selenium WebDriver C#

I am using Selenium WebDriver for UI automation. Below is my sample code:

    IWebDriver driver = new OpenQA.Selenium.IE.InternetExplorerDriver();
    string url = "http://stackoverflow.com";
    driver.Navigate().GoToUrl(url);
    string pagesource = driver.PageSource;

The pagesource variable does not contain the DOCTYPE. I need the DOCTYPE for W3C validation. Is there any way to get the DOCTYPE of the HTML source through Selenium?

This thread shows that there is no way to get the DOCTYPE of the HTML source through Selenium; instead, you can make an HTTP request from .NET and read the DOCTYPE from the response. I don't want to make a separate HTTP request just to get the DOCTYPE.

Using FirefoxDriver instead of InternetExplorerDriver will get you the DOCTYPE. Unfortunately this won't solve your problem: the source you get from driver.PageSource has already been preprocessed by the browser, so validating that code won't give reliable results.

Unfortunately there are no easy solutions.

If your page is not password protected, you can use the "validate by URI" method.
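As a rough sketch of what "validate by URI" can look like, the helper below builds a link that asks a validator to fetch the page itself. It assumes the W3C Nu HTML Checker at validator.w3.org/nu and its `doc` and `out` query parameters; `buildValidatorUrl` and `wantJson` are hypothetical names introduced only for illustration.

```javascript
// Build a "validate by URI" link.
// Assumption: the W3C Nu HTML Checker is used, which takes the page
// address in the `doc` query parameter and can return JSON via `out=json`.
function buildValidatorUrl(pageUrl, wantJson) {
  // The page address must be percent-encoded to survive as a query value.
  var query = "doc=" + encodeURIComponent(pageUrl);
  if (wantJson) {
    query += "&out=json";
  }
  return "https://validator.w3.org/nu/?" + query;
}
```

This only works when the validator can reach the page, which is why it is limited to pages that are public and need no login.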

Otherwise you need to obtain the page source yourself. I know two ways of doing it (I implemented both in my project). One is to use a proxy; if you are using C#, take a look at FiddlerCore. The other is to make another request using JavaScript and XMLHttpRequest. You can find an example here (search the page for XMLHttpRequest).
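The XMLHttpRequest route might look roughly like the sketch below. It is not the linked example; `fetchRawSource` and its optional `XhrCtor` parameter are hypothetical names, and the request is synchronous only so that the text can be returned directly to the caller.

```javascript
// Re-request a page and return the raw, unprocessed response text.
// `XhrCtor` is a hypothetical hook that lets a substitute constructor be
// passed in; inside a browser it can be omitted.
function fetchRawSource(url, XhrCtor) {
  var Ctor = XhrCtor || XMLHttpRequest;
  var xhr = new Ctor();
  // Third argument false = synchronous request, so no callback is needed.
  xhr.open("GET", url, false);
  xhr.send(null);
  if (xhr.status !== 200) {
    throw new Error("Unexpected HTTP status " + xhr.status);
  }
  return xhr.responseText;
}
```

When run through WebDriver, the script passed to ExecuteScript could end with `return fetchRawSource(window.location.href);`. Because the request is made from inside the browser session, it carries the same cookies, so it should also work for pages behind a login, though it is still a second request to the server.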

For W3C validation we basically have three issues when automating through Selenium WebDriver.

  1. Getting the proper page source, since driver.PageSource is not reliable.
  2. Getting the doctype of the HTML source.
  3. Dealing with controls rendered through AJAX calls. Since these controls do not appear in the page source, how do we get the exact "generated source" of the page?

All of the above can be done by executing JavaScript through Selenium WebDriver.

Store the code snippet below in a text file called 'htmlsource.txt'.


    function outerHTML(node) {
        // IE and Chrome provide element.outerHTML; older versions of Firefox
        // do not, so fall back to cloning the node into a temporary <div>.
        return node.outerHTML || (function (n) {
            var div = document.createElement('div'), h;
            div.appendChild(n.cloneNode(true));
            h = div.innerHTML;
            div = null;
            return h;
        })(node);
    }

    var outerhtml = outerHTML(document.getElementsByTagName('html')[0]);
    var node = document.doctype;
    var doctypestring = "";
    if (node) {
        // Rebuild the doctype declaration from the DocumentType node.
        doctypestring = "<!DOCTYPE "
            + node.name
            + (node.publicId ? ' PUBLIC "' + node.publicId + '"' : '')
            + (!node.publicId && node.systemId ? ' SYSTEM' : '')
            + (node.systemId ? ' "' + node.systemId + '"' : '')
            + '>';
    } else {
        // IE8 and below return null for document.doctype;
        // there the doctype can be read like this instead.
        doctypestring = document.all[0].text;
    }
    return doctypestring + outerhtml;

And now the C# code to retrieve the complete AJAX-rendered HTML source, including the doctype:


    // Requires: using System.IO; using OpenQA.Selenium;
    IJavaScriptExecutor js = (IJavaScriptExecutor)driver;
    string jsToExecute = File.ReadAllText("htmlsource.txt");
    string completeHTMLGeneratedSourceWithDoctype = (string)js.ExecuteScript(jsToExecute);
