
HTMLAgilityPack C# Get all nodes and subnodes

I am trying to scrape some data off this webpage and am having some trouble doing so. I would like to obtain only three pieces of data per row: one for the team name, one for the points, and one for the position. An example of the console output would look similar to this:

Uta 23.52 Centers
Uta 29.22 Power Forwards
Uta 29.86 Point Guards
Uta 26.22 Small Forward
Uta 26.61 Shooting Guard

I have devised the code below, but the foreach loops are duplicating the data: they seem to be assigning each value to each position, to each point, and so on. Any help would be greatly appreciated!

 private void button1_Click(object sender, EventArgs e)
    {
        try
        {
            var doc = new HtmlWeb().Load("https://www.sportingcharts.com/nba/defense-vs-position/");
            HtmlAgilityPack.HtmlNodeCollection teams = doc.DocumentNode.SelectNodes("//div[@class='col col-md-3']//tr/td[2]");
            HtmlAgilityPack.HtmlNodeCollection points = doc.DocumentNode.SelectNodes(".//div[@class='col col-md-3']//tr/td[3]");
            HtmlAgilityPack.HtmlNodeCollection positions = doc.DocumentNode.SelectNodes(".//div[@class='col col-md-3']//span[1]");

            List<Record> lstRecords = new List<Record>();
            foreach (HtmlAgilityPack.HtmlNode teamnode in teams)
            {
                foreach (HtmlAgilityPack.HtmlNode pointsnode in points)
                {
                    foreach (HtmlAgilityPack.HtmlNode positionnode in positions)

                        Console.WriteLine(teamnode.InnerText + ' ' + pointsnode.InnerText + ' ' + positionnode.InnerText);

                }


            }
        }
        catch { }

    }

Your main problem is the nested foreach approach. What you are telling your code is: for each team, give me all the points, and for each point, give me all the positions. Since the teams and the points line up row for row, my approach uses a single for loop with a shared index. Where it gets tricky is the positions, but you know that every position has only 30 rows, so dividing the row index by 30 (integer division) gives the index of the matching position.

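    // Requires: using System; using System.Linq; using HtmlAgilityPack;
    var doc = new HtmlWeb().Load("https://www.sportingcharts.com/nba/defense-vs-position/");
    HtmlAgilityPack.HtmlNodeCollection teams = doc.DocumentNode.SelectNodes("//div[@class='col col-md-3']//tr/td[2]");
    HtmlAgilityPack.HtmlNodeCollection points = doc.DocumentNode.SelectNodes(".//div[@class='col col-md-3']//tr/td[3]");
    HtmlAgilityPack.HtmlNodeCollection positions = doc.DocumentNode.SelectNodes(".//div[@class='col col-md-3']//span[1]");

    // Keep only the spans whose text is at least 6 characters long (used here to pick out the position names).
    string[] positions_aux = positions.Where(x => x.InnerText.Length >= 6).Select(y => y.InnerText).ToArray();

    // teams and points are parallel collections: row i of one matches row i of the other.
    // Loop up to teams.Count (not Count - 1) so the last row is not skipped.
    for (int i = 0; i < teams.Count; i++)
    {
        // Every position has 30 rows, so integer division maps the row index to its position.
        var aux = i / 30;
        Console.WriteLine(teams[i].InnerText + ' ' + points[i].InnerText + ' ' + positions_aux[aux]);
    }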
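
To check the indexing idea without hitting the website, here is a minimal sketch that uses plain arrays instead of HtmlAgilityPack nodes. The class name `PairingSketch`, the method `Pair`, the `rowsPerPosition` parameter, and all sample values are made up for illustration; the real page is assumed to have 30 rows per position, while the sample uses 2 to keep it short.

```csharp
using System;
using System.Collections.Generic;

public static class PairingSketch
{
    // Pairs teams[i] with points[i] and maps the row index to a position
    // by integer division, instead of nesting three loops.
    public static List<string> Pair(
        IReadOnlyList<string> teams,
        IReadOnlyList<string> points,
        IReadOnlyList<string> positions,
        int rowsPerPosition)
    {
        var lines = new List<string>();

        // Guard against the two parallel lists having different lengths.
        int count = Math.Min(teams.Count, points.Count);

        for (int i = 0; i < count; i++)
        {
            // Rows 0..rowsPerPosition-1 belong to positions[0], the next block to positions[1], etc.
            int group = i / rowsPerPosition;
            if (group >= positions.Count)
            {
                break; // more rows than positions can account for; stop rather than throw
            }

            lines.Add($"{teams[i]} {points[i]} {positions[group]}");
        }

        return lines;
    }

    public static void Main()
    {
        // Hypothetical sample data: two positions with two rows each.
        var teams = new[] { "Uta", "Bos", "Uta", "Bos" };
        var points = new[] { "23.52", "24.10", "29.22", "30.05" };
        var positions = new[] { "Centers", "Power Forwards" };

        foreach (var line in Pair(teams, points, positions, rowsPerPosition: 2))
        {
            Console.WriteLine(line);
        }
    }
}
```

With the sample data this produces one line per row (4 lines), whereas the nested foreach version would print 4 x 4 x 2 = 32 lines, which is the duplication described in the question.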
