
Extracting text values from an HTML source file

In this code, the variable TempTxt holds the HTML body content as a string. How can I extract the inner text or inner HTML of a <table> or <td> element using lambda syntax?

    public string ExtractPageValue(IWebDriver DDriver, string url = "")
    {
        if (string.IsNullOrEmpty(url))
            url = @"http://www.boi.org.il/he/Markets/ExchangeRates/Pages/Default.aspx";

        // "directory" is not declared in this snippet; it is presumably the folder
        // that holds the IE driver executable.
        var service = InternetExplorerDriverService.CreateDefaultService(directory);
        service.LogFile = directory + @"\seleniumlog.txt";
        service.LoggingLevel = InternetExplorerDriverLogLevel.Trace;

        var options = new InternetExplorerOptions();
        options.IntroduceInstabilityByIgnoringProtectedModeSettings = true;

        // Note: this replaces whatever driver instance the caller passed in.
        DDriver = new InternetExplorerDriver(service, options, TimeSpan.FromSeconds(60));
        DDriver.Navigate().GoToUrl(url);

        // Full HTML of the loaded page as a single string.
        var TempTxt = DDriver.PageSource;
        return ""; //Math.Round(Convert.ToDouble( TempTxt.Split(' ')[10]),2).ToString();
    }

If you are open to trying HtmlAgilityPack:

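    // Requires "using System.Linq;". Here html is the page source string
    // (TempTxt in the question).
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(html);

    // One inner list per <tr>, holding the inner text of each <td> in that row.
    var table = doc.DocumentNode.SelectNodes("//table/tr")
                   .Select(tr => tr.Elements("td").Select(td => td.InnerText).ToList())
                   .ToList();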
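
A slightly more defensive variant of the same idea is sketched below. It is only an illustration, not part of the original answer: the class name `TableExtractor`, the method `ExtractRows`, and the sample markup are hypothetical. It assumes the HtmlAgilityPack NuGet package is referenced. Two details it tries to cover: `SelectNodes` returns null rather than an empty collection when nothing matches, and the page source a browser hands back may have `<tbody>` between `<table>` and `<tr>`, in which case `//table/tr` would match nothing while `//table//tr` still does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class TableExtractor
{
    // Returns one list of cell texts per <tr>; empty if no table rows are found.
    public static List<List<string>> ExtractRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // "//table//tr" also matches rows nested inside <tbody>.
        var rows = doc.DocumentNode.SelectNodes("//table//tr");
        if (rows == null)
            return new List<List<string>>();

        return rows
            .Select(tr => tr.Elements("td")
                            // Decode entities such as &nbsp; and trim whitespace.
                            .Select(td => HtmlEntity.DeEntitize(td.InnerText).Trim())
                            .ToList())
            .Where(cells => cells.Count > 0) // drop rows with no <td> (e.g. <th>-only header rows)
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical sample markup standing in for DDriver.PageSource.
        const string html =
            "<table><tbody>" +
            "<tr><th>Currency</th><th>Rate</th></tr>" +
            "<tr><td>USD</td><td> 3.50 </td></tr>" +
            "</tbody></table>";

        foreach (var row in ExtractRows(html))
            Console.WriteLine(string.Join(" | ", row));
    }
}
```

In the question's method, the call would be `ExtractRows(TempTxt)`. To get the markup of a cell instead of its text, `td.InnerHtml` can be used in place of `td.InnerText`.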
