
How to count rows in a table in an HTML file with C#

When an HTML file contains a compound table, how can one count the rows of the parent table?

By a compound table I mean a table in which some cells contain other tables.

Here is my attempt. Note that it returns an incorrect value:

        String htmlFile = "C:/Temp/Test_13.html";
        HtmlDocument doc = new HtmlDocument();
        doc.Load(htmlFile);

        HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table");
        HtmlNodeCollection rows = tables[1].SelectNodes(".//tr");
        Console.WriteLine(" Rows in second (Parent) table: " + rows.Count());

Please indicate which namespace is used in your answer.

Here is a representative sample file:

<html>
<body>
<table border="1">
<tr>
<td>Apps</td>
</tr>
<tr>
<td>Office Web Apps</td>
</tr>
</table>
<br/>
<table border="1">
<tr>
<td>Application</td>
<td>Status</td>
<td>Instances</td>
</tr>
<tr>
<td>PowerPoint</td>
<td>Online</td>
<td>
    <table border="1">
    <tr>
        <td>Server1</td>
        <td>Online</td>
    </tr>
    <tr>
        <td>Server2</td>
        <td>Disabled</td>
    </tr>
    </table>
</td>
</tr>
<tr>
<td>Word</td>
<td>Online</td>
<td>
    <table border="1">
    <tr>
        <td>Server1</td>
        <td>Online</td>
    </tr>
    <tr>
        <td>Server2</td>
        <td>Disabled</td>
    </tr>
    </table>
</td>
</tr>
</table>
</body>
</html>

Thank you.

You can push each <table> and <tr> onto a stack; when you encounter </table>, pop until the table itself has been popped off the stack.
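The stack idea above can be sketched roughly as follows. This is a simplified variant, not the answerer's exact procedure: it keeps only the currently open tables on the stack and credits each <tr> to the innermost one. The class and method names (StackRowCounter, CountRowsPerTable) are hypothetical, and the regex tokenizer assumes reasonably clean markup with no table tags inside comments or scripts. It uses only System, System.Collections.Generic and System.Text.RegularExpressions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class StackRowCounter
{
    // Returns the number of direct rows of every table, ordered by the
    // position of each opening <table> tag in the document.
    public static List<int> CountRowsPerTable(string html)
    {
        var counts = new List<int>();   // counts[k] = rows owned by table k
        var open = new Stack<int>();    // indexes of tables that are still open
        var tags = new Regex(@"<(/?)(table|tr)\b[^>]*>", RegexOptions.IgnoreCase);

        foreach (Match m in tags.Matches(html))
        {
            bool closing = m.Groups[1].Value == "/";
            string name = m.Groups[2].Value.ToLowerInvariant();

            if (name == "table" && !closing)
            {
                counts.Add(0);
                open.Push(counts.Count - 1);
            }
            else if (name == "table" && closing)
            {
                if (open.Count > 0) open.Pop();
            }
            else if (name == "tr" && !closing && open.Count > 0)
            {
                counts[open.Peek()]++;  // the row belongs to the innermost open table
            }
        }
        return counts;
    }

    public static void Main()
    {
        // Condensed version of the sample file from the question.
        string html =
            "<table><tr><td>Apps</td></tr><tr><td>Office Web Apps</td></tr></table>" +
            "<table><tr><td>Application</td></tr>" +
            "<tr><td>PowerPoint</td><td><table><tr><td>Server1</td></tr><tr><td>Server2</td></tr></table></td></tr>" +
            "<tr><td>Word</td><td><table><tr><td>Server1</td></tr><tr><td>Server2</td></tr></table></td></tr>" +
            "</table>";

        Console.WriteLine(string.Join(", ", CountRowsPerTable(html))); // 2, 3, 2, 2
    }
}
```

The second entry of the result is the parent table, so its three rows are counted without the rows of the nested tables.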

I would recommend trying the CsQuery NuGet package. It is designed to take most of the headaches out of tasks exactly like this one. You can use CSS selector syntax, which most web developers are quite familiar with. In this case, you could probably get away with body > table:nth-of-type(2) > tr, which returns all the matching tr elements; then just count them or check the length of the result. Alternatively, body > table ~ table > tr would work as well for the sample you gave, as would br + table > tr.

If I understood correctly, this is what you want.

int i = 1;
HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table");
foreach (HtmlNode table in tables)
{
    var tmp = table.ParentNode;
    if (tmp.OriginalName.Contains("td"))
        MessageBox.Show("The parent of table #" + i + " has " + tmp.ParentNode.ParentNode.Elements("tr").Count().ToString() + " rows.");
    i++;
}

The MessageBox will pop up 2 times:

"The parent of table #3 has 3 rows."
"The parent of table #4 has 3 rows."

EDIT (ANSWERING QUESTIONS):

1) I started the counter at int i = 1. Writing var i = 1 is the same thing; the compiler simply infers int for var.

2) I edited the code, so you should now get the same result as I do.

3) I started counting from 1, so you have table #1, table #2, table #3 and table #4. Your last two tables (#3 and #4) are sub-tables of table #2, and table #2 has 3 rows. The code above reports only tables that are sub-tables of another table. Can you show me what you want as the answer?

EDIT 2:

int i = 1;
HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table");
foreach (HtmlNode table in tables)
{
    if (!table.ParentNode.OriginalName.Contains("td")) // If table is not sub-table
        MessageBox.Show("Table #" + i + " has " + table.Elements("tr").Count().ToString() + " rows.");
    i++;
}

The MessageBox will pop up 2 times:

"Table #1 has 2 rows."
"Table #2 has 3 rows."
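Since the question asks which namespaces are involved: HtmlDocument, HtmlNode and HtmlNodeCollection come from HtmlAgilityPack (the NuGet package of the same name), the Count() extension method on Elements("tr") needs System.Linq, and MessageBox lives in System.Windows.Forms. As an alternative to Elements("tr"), the same direct-child count can probably be expressed in XPath, because ./tr selects only child rows whereas .//tr also descends into nested tables. The sketch below is a minimal illustration of that, not a definitive implementation; the class and method names (DirectRowCounter, CountDirectRows) are hypothetical.

```csharp
using System;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

public static class DirectRowCounter
{
    // Counts only the rows that are direct children of the table at
    // tableIndex, where tables are numbered in document order from 0.
    public static int CountDirectRows(string html, int tableIndex)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table");
        if (tables == null || tableIndex < 0 || tableIndex >= tables.Count) return 0;

        // "./tr" = child rows only; ".//tr" would include rows of nested tables.
        HtmlNodeCollection rows = tables[tableIndex].SelectNodes("./tr");
        return rows == null ? 0 : rows.Count;
    }

    public static void Main()
    {
        // Condensed version of the sample file from the question.
        string html =
            "<table><tr><td>Apps</td></tr><tr><td>Office Web Apps</td></tr></table>" +
            "<table><tr><td>Application</td></tr>" +
            "<tr><td>PowerPoint</td><td><table><tr><td>Server1</td></tr><tr><td>Server2</td></tr></table></td></tr>" +
            "<tr><td>Word</td><td><table><tr><td>Server1</td></tr><tr><td>Server2</td></tr></table></td></tr>" +
            "</table>";

        Console.WriteLine(" Rows in second (Parent) table: " + CountDirectRows(html, 1)); // 3
    }
}
```

If the markup wraps its rows in an explicit tbody element, ./tr will find nothing and the path would need to account for that, for example with ./tbody/tr.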
