
What's the fastest way to parse this HTML table?

I'm trying to parse a table in Node that I get from a website. I want to ignore the header and parse only the actual transaction rows. The table looks like this:

        <tbody><tr class="dgHeader" style="font-weight:bold;">
            <th scope="col">Reference 1</th><th scope="col">Reference 2</th><th scope="col">Reference 3</th><th scope="col">Reference 4</th><th scope="col">Gross Amount</th><th scope="col">Discounts/Surcharges</th><th scope="col">Net Amount</th><th scope="col">Means of Payment</th><th scope="col">Form of Payment</th><th scope="col">Payment Folio</th><th scope="col">Branch</th><th scope="col">Time</th><th scope="col">Maturity Date</th><th scope="col">Payment date</th>             </tr><tr align="left">
            <td align="left">
                        <span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblReferencia1">0000000000000000000000000000000X4D649G66</span>
                    </td><td align="left">
                        <span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblReferencia2"></span>
                    </td><td align="left">
                        <span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblReferencia3"></span>
                    </td><td align="left">
                        <span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblReferencia4"></span>
                    </td><td align="right">
                        <span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblImporteBruto">$40.00</span>
                    </td><td align="left">
                        <span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblDescuentosRecargos">$0.00</span>
                    </td><td align="right">
                    <span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblImporteNeto">$40.00</span>
                    </td><td align="left">
                        <span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblMedioPago">Internet</span>
                    </td><td align="left">
                        <span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblFormaPago">Cash</span>
                    </td><td align="left">
                        <span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblFolioPago">45786172008896142466 </span>
                    </td><td align="left">
                        <span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblSucursal">4578</span>
                    </td><td align="left">
                        <span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblHora">01:48:59 p.m.</span>
                    </td><td>
                        <span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblFechaVencimiento">00/00/0000</span>
                    </td><td align="left">
                        <span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblFechaPago">20/06/2016</span>
                    </td>           </tr>       </tbody>

I've been using Cheerio, but I'm having a hard time using the span ids to get the data out of the table.

This ended up solving it and let me obtain the reference code pretty easily:

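var cheerio = require('cheerio');

var $ = cheerio.load(str, {
    ignoreWhitespace: true
});

$('tr').each(function(i, tr) {
    // This id is the one on the data row shown above (ctl02).
    var reference = $('#ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblReferencia1').text();
});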
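The snippet above reads one hard-coded id, so it returns the same value on every pass through the loop. A minimal sketch of how this could extend to every transaction row follows. It assumes that each data row repeats the same `_lbl...` id suffixes with a different `ctlNN` part, which the sample only shows for `ctl02`. The names `fieldName` and `parseTransactions` are hypothetical helpers, not part of Cheerio.

```javascript
// Hypothetical helper: reduce a generated span id to its field name,
// e.g. "..._ctl02_lblReferencia1" -> "Referencia1".
function fieldName(id) {
  const marker = id.lastIndexOf('_lbl');
  // Ids without the "_lbl" marker are returned unchanged.
  return marker === -1 ? id : id.slice(marker + 4);
}

// Hypothetical helper: build one plain object per transaction row.
// Assumption: `html` contains the enclosing <table>. A bare <tbody>
// fragment may lose its rows under a spec-compliant HTML parser.
function parseTransactions(html) {
  // Required here so that fieldName stays usable without Cheerio.
  const cheerio = require('cheerio');
  const $ = cheerio.load(html);
  const rows = [];

  // Skip the header row, which carries the "dgHeader" class.
  $('tr').not('.dgHeader').each(function (i, tr) {
    const record = {};
    // Every value in the sample sits in a span whose id contains "_lbl".
    $(tr).find('span[id*="_lbl"]').each(function (j, span) {
      record[fieldName($(span).attr('id'))] = $(span).text().trim();
    });
    rows.push(record);
  });

  return rows;
}
```

Calling `parseTransactions(str)` on the full page would then give an array in which each element exposes fields such as `Referencia1`, `ImporteNeto`, and `FechaPago`, without naming any `ctlNN` row id explicitly.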

