How to select the second table in an HTML file in C#

I need to get the sum of the 4th row of the second table. I want to sum beginning with the second column. How do I do this?

    static void Main()
    {
        String htmlFile = "C:/Temp/Test_11.html";

        HtmlDocument doc = new HtmlDocument();
        doc.Load(htmlFile);

        //var sum = doc.DocumentNode.SelectSingleNode("//table")  // <<<< No error when I access the first table 
        var sum = doc.DocumentNode.SelectSingleNode("//table[2]")  // <<<< Error when I try to access the 2nd table
            .Elements("tr")
            // Skip this many rows from the top
            .Skip(1)
            // .ElementAt(2) = third column
            .Sum(tr => int.Parse(tr.Elements("td").ElementAt(2).InnerText));
        Console.WriteLine(sum);

        Console.ReadLine();
    }

Below is the HTML file, which consists of two tables. The result of the sum should be 26.

<html>
<head>
<title>Tables</title>
</head>
<body>
<table border="1">
  <tr>
    <th>Environment1</th>
    <th>Databases</th>
    <th>Sites</th>
    <th>Site Collection Storage Used (GB)</th>
    <th>Ref</th>
</th>
  </tr>
  <tr>
    <td>Public1</td>
    <td>14</td>
    <td>28</td>
    <td>32.6602</td>
    <td>2</td>
  </tr>
  <tr>
    <td>Local1</td>
    <td>4</td>
    <td>9</td>
    <td>21.0506</td>
    <td>1</td>
  </tr>
  <tr>
    <td>Shared1</td>
    <td>6</td>
    <td>9</td>
    <td>17.092</td>
    <td>9</td>
  </tr>
</table>
<p></p>
<table border="1">
  <tr>
    <th>Environment2</th>
    <th>Databases</th>
    <th>Sites</th>
    <th>Site Collection Storage Used (GB)</th>
    <th>Ref</th>
 </th>
  </tr>
  <tr>
    <td>Public2</td>
    <td>15</td>
    <td>13</td>
    <td>31.5602</td>
    <td>1</td>
  </tr>
  <tr>
    <td>Local2</td>
    <td>5</td>
    <td>8</td>
    <td>7.0302</td>
    <td>3</td>
  </tr>
  <tr>
    <td>Shared2</td>
    <td>4</td>
    <td>5</td>
    <td>13.109</td>
    <td>4</td>
  </tr>
</table>

</body>
</html>

Please help me with this

The following sample does this. The only caveat is that I have hardcoded the positions and element names in the XPath; change them freely to suit your requirement.

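// xdoc is presumably a System.Xml.XmlDocument that already holds the page (the markup must be well-formed XML).
var tbl = xdoc.SelectNodes("//table[2]//tr[4]//td[position()>1]");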

// To select the same cells from the 1st table instead, use the following statement.
//var tbl = xdoc.SelectNodes("//table[1]//tr[4]//td[position()>1]");

int sum = 0;

foreach (XmlNode item in tbl)
{
    decimal value = 0;
    if (decimal.TryParse(item.InnerText, out value))
    {
        sum += (int)value;
    }
}

Console.WriteLine("Number of (Web Application) sites: " + sum);
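One caveat on the `//table[2]` step used above: in XPath 1.0 the predicate applies per parent, so it matches a table that is the second `table` child of its own parent, whereas `(//table)[2]` picks the second table in document order. The post does not confirm whether this is what caused the error in the question, so treat the following as a minimal sketch of the difference only. The markup, the `TableIndexDemo` class and the `FirstHeader` helper are all made up for illustration.

```csharp
using System;
using System.Xml;

static class TableIndexDemo
{
    // Hypothetical helper: returns the first <th> text of the table matched
    // by xpath, or null when the expression matches nothing.
    public static string FirstHeader(XmlDocument doc, string xpath)
    {
        XmlNode table = doc.SelectSingleNode(xpath);
        return table?.SelectSingleNode(".//th")?.InnerText;
    }

    static void Main()
    {
        // Assumed markup: each table sits in its own <div>, so neither table
        // has a sibling table under the same parent.
        var doc = new XmlDocument();
        doc.LoadXml(
            "<html><body>" +
            "<div><table><tr><th>Environment1</th></tr></table></div>" +
            "<div><table><tr><th>Environment2</th></tr></table></div>" +
            "</body></html>");

        // No parent has two table children, so this finds nothing.
        Console.WriteLine(FirstHeader(doc, "//table[2]") ?? "(no match)");   // (no match)

        // Second table in document order, regardless of nesting.
        Console.WriteLine(FirstHeader(doc, "(//table)[2]"));                 // Environment2
    }
}
```

When both tables are direct children of the same element, as in the question's file, the two expressions select the same table.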

There are a couple of points to note. Your cells contain decimal values, but your code treats them as integers. I therefore parse each value into a temporary decimal variable and then cast it to int, which truncates the fraction (13.109 becomes 13, which gives the expected 26). This is not best practice, though, so adjust it to your exact requirement.

Additionally, I have used TryParse so that only the cells that parse as numbers are included in the calculation.
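If the fractional part matters, one alternative is to keep the running total as a decimal and cast only at the end, if at all. The sketch below is an illustration under assumptions, not the original sample: the `RowSumDemo` class, the `SumRow` helper and its parameters are hypothetical, the inline markup is a trimmed, well-formed copy of the question's second table (XmlDocument rejects markup that is not valid XML), and parsing uses the invariant culture so that `13.109` is read the same way on any machine.

```csharp
using System;
using System.Globalization;
using System.Xml;

static class RowSumDemo
{
    // Hypothetical helper: sums the numeric cells of one row, starting at
    // firstColumn (1-based, like XPath positions). Cells that do not parse
    // as a number are skipped.
    public static decimal SumRow(XmlDocument doc, int table, int row, int firstColumn)
    {
        string xpath = $"(//table)[{table}]//tr[{row}]/td[position()>={firstColumn}]";
        decimal sum = 0m;
        foreach (XmlNode cell in doc.SelectNodes(xpath))
        {
            if (decimal.TryParse(cell.InnerText.Trim(), NumberStyles.Number,
                                 CultureInfo.InvariantCulture, out decimal value))
            {
                sum += value;
            }
        }
        return sum;
    }

    static void Main()
    {
        // The first table is a placeholder; the second carries the question's data.
        var doc = new XmlDocument();
        doc.LoadXml(
            "<html><body>" +
            "<table><tr><th>Environment1</th></tr></table>" +
            "<table>" +
            "<tr><th>Environment2</th><th>Databases</th><th>Sites</th><th>Storage</th><th>Ref</th></tr>" +
            "<tr><td>Public2</td><td>15</td><td>13</td><td>31.5602</td><td>1</td></tr>" +
            "<tr><td>Local2</td><td>5</td><td>8</td><td>7.0302</td><td>3</td></tr>" +
            "<tr><td>Shared2</td><td>4</td><td>5</td><td>13.109</td><td>4</td></tr>" +
            "</table></body></html>");

        decimal total = SumRow(doc, table: 2, row: 4, firstColumn: 2);
        Console.WriteLine(total.ToString(CultureInfo.InvariantCulture));         // 26.109
        Console.WriteLine(((int)total).ToString(CultureInfo.InvariantCulture));  // 26
    }
}
```

Casting once at the end gives the same 26 here, but in general it can differ from casting each cell, because per-cell truncation discards every fraction before the values are added.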

Please share your understanding here.
