
Selenium C# driver.PageSource - 'is too long, or a component of the specified path is too long.'

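I am trying to pass driver.PageSource from Selenium C# to HTML Agility Pack, but the line htmlDoc.Load(driver.PageSource); throws this error: '...' is too long, or a component of the specified path is too long.

P.S. Selenium for Python with Beautiful Soup does not produce this error when I do the same thing in Python instead of C#.

How can I fix this?

Full code: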

using System;
using System.Threading;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

namespace SeleniumSharp
{
    public static class WebScraping
    {
        public static void GetPageData()
        {
            // initial setup
            IWebDriver driver = new ChromeDriver();
            driver.Navigate().GoToUrl("<url>");

            // dropdown
            var dropdown1 = driver.FindElement(By.Id("cpMain_ucc1_ctl00_liResidentialFront"));
            dropdown1.Click();
            
            // enter search query
            var search = driver.FindElement(By.Id("cpMain_ucc1_ctl00_txtResidentialSearchBox"));
            search.Click();
            search.SendKeys("london");
            Thread.Sleep(3000);

            // submit search
            var submit = driver.FindElement(By.XPath("//div[@id='cpMain_ucc1_ctl00_pnlContentResidential']//a[@class='search-button']"));
            submit.Click();

            // Html Agility Pack
            HtmlDocument htmlDoc = new HtmlDocument();
            htmlDoc.Load(driver.PageSource);

            var address = htmlDoc.DocumentNode
                .SelectNodes("//div[@class='grid-address']")
                .ToList();

            foreach(var item in address)
            {
                Console.WriteLine(item.InnerText);
            }

        }

        
    }
}

This line of code throws the error:

htmlDoc.Load(driver.PageSource);

Error:

'<html source>' is too long, or a component of the specified path is too long.
   at System.IO.PathHelper.GetFullPathName(ReadOnlySpan`1 path, ValueStringBuilder& builder)
   at System.IO.PathHelper.Normalize(String path)
   at System.IO.Path.GetFullPath(String path)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
   at System.IO.StreamReader.ValidateArgsAndOpenPath(String path, Encoding encoding, Int32 bufferSize)  
   at System.IO.StreamReader..ctor(String path, Encoding encoding)
   at HtmlAgilityPack.HtmlDocument.Load(String path)

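This happens because you are calling Load instead of LoadHtml. The Load method takes the path of a file that contains HTML, not the HTML source itself (driver.PageSource). That is why the stack trace ends in System.IO.Path.GetFullPath: the entire page source is being treated as a file path.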

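// From a file: Load expects a path on disk
var docFromFile = new HtmlDocument();
docFromFile.Load(filePath);

// From a string: LoadHtml parses the markup itself
var docFromString = new HtmlDocument();
docFromString.LoadHtml(html);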

So try using:

htmlDoc.LoadHtml(driver.PageSource);
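
For reference, here is a minimal sketch of the parsing step on its own, without Selenium. The class name GridAddressParser, the method ExtractAddresses, and the sample markup are hypothetical and only stand in for driver.PageSource. The sketch also guards against SelectNodes returning null, which it does when no node matches the XPath, so a page with no results should not crash the loop.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of the original question.
public static class GridAddressParser
{
    // Parses markup held in memory and returns the text of each grid-address div.
    public static List<string> ExtractAddresses(string html)
    {
        var htmlDoc = new HtmlDocument();

        // LoadHtml parses the string itself; Load(string) would treat it as a file path.
        htmlDoc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = htmlDoc.DocumentNode.SelectNodes("//div[@class='grid-address']");
        if (nodes == null)
        {
            return new List<string>();
        }

        return nodes.Select(node => node.InnerText.Trim()).ToList();
    }

    public static void Main()
    {
        // Made-up markup standing in for driver.PageSource.
        const string html =
            "<html><body>" +
            "<div class='grid-address'>1 Example Street, London</div>" +
            "<div class='grid-address'>2 Sample Road, London</div>" +
            "</body></html>";

        foreach (var address in ExtractAddresses(html))
        {
            Console.WriteLine(address);
        }
    }
}
```

In the question's code, the equivalent call would be ExtractAddresses(driver.PageSource) after the search has been submitted and the results have loaded.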

