How can I parse repeated HTML elements by their class attribute?

I'm trying to parse an HTML file in which every row uses basically the same tags.

I want to get this output:

BTC - Bitcoin, BEP20(BSC), Bitcoin(SegWit)

ETH - ERC20, BEP20(BSC), POLYGON, ARBITRUM, AURORA, METISEVM

USDT - OMNI, TRC20, ERC20, BEP20(BSC), HECO, POLYGON, FTM, AVAX-C, ARBITRUM, METISEVM

QASH - ERC20

Here is a sample of the HTML:

<div data-v-326d86f4="" class="table-box">
   <table data-v-326d86f4="">
      <tr data-v-326d86f4="">
         <td data-v-326d86f4="">BTC</td>
         <td data-v-326d86f4="" class="block-chain">
            <div data-v-326d86f4="" class="chain_box"><span data-v-326d86f4="" class="chain_name">Bitcoin</span> <span data-v-326d86f4=""><i data-v-326d86f4="" class="fa fa-caret-down"></i></span></div>
            <div data-v-326d86f4="" class="select-list"><span data-v-326d86f4="">Bitcoin</span><span data-v-326d86f4="">BEP20(BSC)</span><span data-v-326d86f4="">Bitcoin(SegWit)</span></div>
         </td>
         <td data-v-326d86f4="">0.001</td>
         <td data-v-326d86f4="">0.002</td>
      </tr>
      <tr data-v-326d86f4="">
         <td data-v-326d86f4="">ETH</td>
         <td data-v-326d86f4="" class="block-chain">
            <div data-v-326d86f4="" class="chain_box"><span data-v-326d86f4="" class="chain_name">ERC20</span> <span data-v-326d86f4=""><i data-v-326d86f4="" class="fa fa-caret-down"></i></span></div>
            <div data-v-326d86f4="" class="select-list"><span data-v-326d86f4="">ERC20</span><span data-v-326d86f4="">BEP20(BSC)</span><span data-v-326d86f4="">POLYGON</span><span data-v-326d86f4="">ARBITRUM</span><span data-v-326d86f4="">AURORA</span><span data-v-326d86f4="">METISEVM</span></div>
         </td>
         <td data-v-326d86f4="">0.012</td>
         <td data-v-326d86f4="">0.024</td>
      </tr>
      <tr data-v-326d86f4="">
         <td data-v-326d86f4="">USDT</td>
         <td data-v-326d86f4="" class="block-chain">
            <div data-v-326d86f4="" class="chain_box"><span data-v-326d86f4="" class="chain_name">OMNI</span> <span data-v-326d86f4=""><i data-v-326d86f4="" class="fa fa-caret-down"></i></span></div>
            <div data-v-326d86f4="" class="select-list"><span data-v-326d86f4="">OMNI</span><span data-v-326d86f4="">TRC20</span><span data-v-326d86f4="">ERC20</span><span data-v-326d86f4="">BEP20(BSC)</span><span data-v-326d86f4="">HECO</span><span data-v-326d86f4="">POLYGON</span><span data-v-326d86f4="">FTM</span><span data-v-326d86f4="">AVAX-C</span><span data-v-326d86f4="">ARBITRUM</span><span data-v-326d86f4="">METISEVM</span></div>
         </td>
         <td data-v-326d86f4="">30</td>
         <td data-v-326d86f4="">50</td>
      </tr>
      <tr data-v-326d86f4="">
         <td data-v-326d86f4="">QASH</td>
         <td data-v-326d86f4="" class="block-chain">
            <div data-v-326d86f4="" class="chain_box">
               <span data-v-326d86f4="" class="chain_name">ERC20</span> <!---->
            </div>
            <!---->
         </td>
         <td data-v-326d86f4="">513</td>
         <td data-v-326d86f4="">1026</td>
      </tr>
      <!-- ... -->

I'm using the HtmlAgilityPack library, without success:

Dim arqHtml As String = "C:\Users\Mattia\Desktop\ready.html"
Dim myHtml As HtmlAgilityPack.HtmlDocument = New HtmlAgilityPack.HtmlDocument()
myHtml.Load(arqHtml)
Dim myTable As HtmlAgilityPack.HtmlNode = myHtml.DocumentNode.SelectSingleNode("//table")
Dim myRows As HtmlAgilityPack.HtmlNodeCollection = myTable.SelectNodes("tr")
For Each tmpRow As HtmlAgilityPack.HtmlNode In myRows
    Dim myCells As HtmlAgilityPack.HtmlNodeCollection = tmpRow.SelectNodes("td")
    If myCells IsNot Nothing Then
        Dim myToken As String = myCells(0).InnerText
        Dim mySpans As HtmlAgilityPack.HtmlNodeCollection = myCells(1).SelectNodes("div[contains(@class,'select-list')]/span")
        If mySpans IsNot Nothing Then
            Dim myListBChain As New List(Of String)
            For Each mySpan As HtmlAgilityPack.HtmlNode In mySpans
                RichTextBox1.Text += mySpan.InnerText
            Next
            Dim allItensAsString = String.Join(", ", richtextbox1.text)
        End If
    End If
Next

This produces the following output:

BitcoinBEP20(BSC)Bitcoin(SegWit)ERC20BEP20(BSC)POLYGONARBITRUMAURORAMETISEVMOMNITRC20ERC20BEP20(BSC)HECOPOLYGONFTMAVAX-CARBITRUMMETISEVMEOSBEP20(BSC)ERC20BEP20(BSC)TRC20BEP20(BSC)ZILBEP20(BSC)NEOLEGACYNEON3ERC20POLYGONERC20DAGBEP2BEP20(BSC)FTMAVAX-CERC20BEP20(BSC)ERC20BEP20(BSC)ERC20HECOBEP20(BSC)ERC20HECOERC20POLYGONERC20HECOERC20POLYGONERC20BEP20(BSC)BCHBEP20(BSC)ERC20LOOPPOLYGONBEP20(BSC)FTMAVAX-CMETISEVMERC20TOLERC20METAERC20BEP20(BSC)

How do I make it return the output I want?

Incorporating my comment on the original question, in the last <tr> in the sample...

<tr data-v-326d86f4="">
    <td data-v-326d86f4="">QASH</td>
    <td data-v-326d86f4="" class="block-chain">
    <div data-v-326d86f4="" class="chain_box">
        <span data-v-326d86f4="" class="chain_name">ERC20</span> <!---->
    </div>
    <!---->
    </td>
    <td data-v-326d86f4="">513</td>
    <td data-v-326d86f4="">1026</td>
</tr>
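
Two things stand out when the question's loop is compared with the sample. The loop appends every span to RichTextBox1.Text across all rows, never writes the token from the first cell, and never fills myListBChain, so nothing separates one row from the next. In addition, a row like the QASH one above has no select-list div at all, so its span query returns Nothing and the row is skipped. The following is only a sketch of one way to handle both points, not a definitive fix: ParseChains is a hypothetical helper name, and the fallback to the chain_name span is an assumption about what should be shown for such rows.

```vbnet
Imports System
Imports System.Collections.Generic
Imports HtmlAgilityPack

Module ChainParser

    ' Hypothetical helper: returns one "TOKEN - chain, chain, ..." line per table row.
    Function ParseChains(html As String) As List(Of String)
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        Dim lines As New List(Of String)
        Dim rows As HtmlNodeCollection = doc.DocumentNode.SelectNodes("//table//tr")
        If rows Is Nothing Then Return lines

        For Each row As HtmlNode In rows
            Dim cells As HtmlNodeCollection = row.SelectNodes("td")
            If cells Is Nothing OrElse cells.Count < 2 Then Continue For

            Dim token As String = cells(0).InnerText.Trim()

            ' Prefer the full list of chains. Rows such as QASH have no
            ' select-list div, so fall back to the single chain_name span
            ' (an assumption about the desired behaviour).
            Dim spans As HtmlNodeCollection =
                cells(1).SelectNodes("div[contains(@class,'select-list')]/span")
            If spans Is Nothing Then
                spans = cells(1).SelectNodes(
                    "div[contains(@class,'chain_box')]/span[contains(@class,'chain_name')]")
            End If
            If spans Is Nothing Then Continue For

            ' Collect the names for this row only, then join them once.
            Dim chains As New List(Of String)
            For Each span As HtmlNode In spans
                chains.Add(span.InnerText.Trim())
            Next

            lines.Add(token & " - " & String.Join(", ", chains))
        Next

        Return lines
    End Function

End Module
```

With a helper like this, the form code could read the file with System.IO.File.ReadAllText and then assign String.Join(Environment.NewLine, ParseChains(html)) to RichTextBox1.Text, which is intended to give one line per token in the "BTC - Bitcoin, BEP20(BSC), Bitcoin(SegWit)" shape requested in the question.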
