Scraping data from the table using Xpath and PHP?

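I want to extract data from the following table; its markup is shown below. I am using XPath to extract the data, but other suggestions are also welcome.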

      <div style="clear:both;" id="showPrice">
      <br>
      <table cellspacing="1">
         <tbody>
             <tr>
                 <td width="50px" style="text-align: left" class="tdhead">SN</td>
                 <td width="650px" style="text-align: left" class="tdhead">Companies</td>
                 <td width="20px" class="tdhead">Trans</td>
                 <td width="50px" class="tdhead"> Max Price</td>
                 <td width="50px" class="tdhead">Min Price</td>
                 <td width="50px" class="tdhead">Closing Price</td>
                 <td width="50px" class="tdhead">Total Shares</td>
                 <td width="50px" class="tdhead">Amount Rs.</td>
                 <td width="50px" class="tdhead">Prev. Closing</td>
                 <td width="20px" class="tdhead">Diff.</td>
                 <td width="50px" class="tdhead">Diff. %</td>
                 <td colspan="3" class="closing-price">
                     <table>
                         <tbody>
                            <tr>
                               <td colspan="3">365&nbsp;days</td>
                             </tr>
                             <tr>
                               <td width="50px" class="closing-price-lighter">Max Price</td>
                               <td width="50px" class="closing-price-lighter">Min Price</td>
                               <td width="50px" class="closing-price-lighter">Avg</td>    
                             </tr>
                         </tbody>
                     </table>
                     </td>
                    </tr>
                    <tr style="background-color: #A61A00">
                       <td style="text-align: center;color:white;">1</td>
                       <td style="text-align: left;padding:3px;">
                          <a href="viewcompany.php?symbol=ACEDBL&amp;id=177" style="text-decoration:none;color:white;">Ace Development Bank Limited</a>
                       </td>
                       <td class="numeric-data">3</td>
                       <td class="numeric-data">269.00</td>
                       <td class="numeric-data">264.00</td><td class="numeric-data" style="background-color:#99CCFF;color:black;">264.00</td>
                       <td class="numeric-data">495</td>
                       <td class="numeric-data">131,405</td>
                       <td class="numeric-data">265.00</td>
                       <td class="numeric-data">-1.00</td>
                       <td class="numeric-data" style="background-color:#99CCFF;color:black;">-0.38</td>
                       <td class="numeric-data" style="background-color:#99FFFF;color:black;">281</td>
                       <td class="numeric-data" style="background-color:#99FFFF;color:black;">102</td>
                       <td class="numeric-data" style="background-color:#99FFFF;color:black;">150.15</td>       
                   </tr>
               </tbody>
            </table>
         </div>

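I only want the data that comes after the closing-price class. The data I need is the text and numeric values in the following td elements of the tr: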

                       <td style="text-align: left;padding:3px;">
                          <a href="viewcompany.php?symbol=ACEDBL&amp;id=177" style="text-decoration:none;color:white;">Ace Development Bank Limited</a>
                       </td>
                       <td class="numeric-data">3</td>
                       <td class="numeric-data">269.00</td>
                       <td class="numeric-data">264.00</td><td class="numeric-data" style="background-color:#99CCFF;color:black;">264.00</td>
                       <td class="numeric-data">495</td>
                       <td class="numeric-data">131,405</td>
                       <td class="numeric-data">265.00</td>
                       <td class="numeric-data">-1.00</td>
                       <td class="numeric-data" style="background-color:#99CCFF;color:black;">-0.38</td>
                       <td class="numeric-data" style="background-color:#99FFFF;color:black;">281</td>
                       <td class="numeric-data" style="background-color:#99FFFF;color:black;">102</td>
                       <td class="numeric-data" style="background-color:#99FFFF;color:black;">150.15</td>       
                   </tr>   

I tried the following expression, but it returns no results:

  //div[@id='showPrice']/td[preceding-sibling::td[@class='closing-price']]/text()

You can also do it like this:

Point the query at that specific <tr> tag:

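// Fetch the page and parse it; libxml warnings about malformed HTML are suppressed.
$html_string = file_get_contents('http://www.sharesansar.com/today.php');
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html_string);
libxml_clear_errors();
$xpath = new DOMXPath($dom);
$values = array();
// Select the direct <td> children of the second row of the first table.
// The path has no tbody step, so it matches only when the raw HTML has none
// (DOMDocument does not insert a <tbody> the way browsers do).
$row = $xpath->query('//div[@id="showPrice"]/table[1]/tr[2]/td');
foreach ($row as $value) {
    $values[] = trim($value->textContent);
}

echo '<pre>';
print_r($values);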

Result:

Array
(
    [0] => 1
    [1] => Ace Development Bank Limited
    [2] => 3
    [3] => 269.00
    [4] => 264.00
    [5] => 264.00
    [6] => 495
    [7] => 131,405
    [8] => 265.00
    [9] => -1.00
    [10] => -0.38
    [11] => 281
    [12] => 102
    [13] => 150.15
)
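
As for why the expression in the question returns nothing: `//div[@id='showPrice']/td` looks for a td that is a direct child of the div, but every td sits inside table/tr. In addition, `preceding-sibling` only sees cells in the same tr, while the `closing-price` cell is in the header row, not in the data row. The following is a minimal sketch of a variant that works whether or not the markup contains `<tbody>` and that skips the SN cell so the result starts at the company name. The inline `$html` sample (trimmed from the markup in the question) and the function name `extractRows` are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical sample, trimmed from the markup shown in the question.
$html = <<<'HTML'
<div style="clear:both;" id="showPrice">
<table cellspacing="1">
  <tbody>
    <tr>
      <td class="tdhead">SN</td>
      <td class="tdhead">Companies</td>
      <td colspan="3" class="closing-price">
        <table><tbody><tr><td colspan="3">365&nbsp;days</td></tr></tbody></table>
      </td>
    </tr>
    <tr style="background-color: #A61A00">
      <td>1</td>
      <td><a href="viewcompany.php?symbol=ACEDBL&amp;id=177">Ace Development Bank Limited</a></td>
      <td class="numeric-data">3</td>
      <td class="numeric-data">269.00</td>
      <td class="numeric-data">131,405</td>
    </tr>
  </tbody>
</table>
</div>
HTML;

// extractRows() is an illustrative helper name, not an existing API.
function extractRows(string $html): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // keep warnings about sloppy HTML quiet
    $dom->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($dom);

    $rows = [];
    // '//tr' matches with or without <tbody>; the predicate keeps only rows
    // that have at least one numeric-data cell, i.e. the data rows.
    $query = '//div[@id="showPrice"]//tr[td[@class="numeric-data"]]';
    foreach ($xpath->query($query) as $tr) {
        $cells = [];
        // Relative query on the row: position() > 1 skips the SN cell.
        foreach ($xpath->query('td[position() > 1]', $tr) as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }
    return $rows;
}

print_r(extractRows($html));
```

For the trimmed sample this should give a single row containing the company name followed by `3`, `269.00` and `131,405`. On the live page the same function would return one array per company row, assuming the page still uses the `numeric-data` class.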
