
Scraping using PHP - preg_match_all

I'm trying to extract the value of Internet Data Volume Balance from the HTML below; the script should echo 146.30mb.

I'm new to all of this and have been working through the tutorials.

How can this be done?

<tr >
    <td bgcolor="#F8F8F8"><div align="left"><B><FONT class="tplus_text">Account Status</FONT></B></div></td>
    <td bgcolor="#FFFFFF"><div align="left"><FONT class="tplus_text">You exceeded your allowed credit.</FONT></div></td>
</tr> 

<tr >
    <td bgcolor="#F8F8F8"><div align="left"><B><FONT class="tplus_text">Period Free Time Remaining</FONT></B></div></td>
    <td bgcolor="#FFFFFF"><div align="left"><FONT class="tplus_text">0:00:00 hours</FONT></div></td>
</tr> 

<tr >
    <td bgcolor="#F8F8F8"><div align="left"><B><FONT class="tplus_text">Internet Data Volume Balance</FONT></B></div></td>
    <td bgcolor="#FFFFFF"><div align="left"><FONT class="tplus_text" style="text-transform:none;">146.30 MB</FONT></div></td>
</tr> 

PHP can interact with the DOM just as JavaScript can. This is far more robust than picking the markup apart with regular expressions, which most people will tell you is the wrong approach anyway:

Loading from an HTML File

// Start by creating a new document
$doc = new DOMDocument();
// I've saved the table in an external file, and am loading it into $doc
$doc->loadHTMLFile( 'htmlpage.html' );
// Your table has six cells, so I'm fetching all of them
$cells = $doc->getElementsByTagName("td");
// item() is zero-based, so index 5 is the sixth cell; echo its textContent
echo $cells->item(5)->textContent;

This code will output "146.30 MB" to the screen.
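Selecting the cell by position breaks as soon as a row is added or reordered. As a minimal sketch, assuming the label text "Internet Data Volume Balance" stays stable, the value can instead be looked up by its label with DOMXPath. The helper name `findValueByLabel` and the inline `$html` sample are illustrative and not part of the original answer.

```php
<?php
// Hypothetical helper: returns the text of the cell that follows the
// cell whose text equals $label, or null if no such row exists.
// Assumes $label contains no double quotes (it is embedded in the XPath).
function findValueByLabel(string $html, string $label): ?string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // First <td> sibling after the <td> whose normalized text is the label.
    $query = sprintf(
        '//td[normalize-space(.) = "%s"]/following-sibling::td[1]',
        $label
    );
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

// Sample markup shortened from the question.
$html = <<<'HTML'
<table>
<tr>
    <td><div align="left"><B><FONT class="tplus_text">Period Free Time Remaining</FONT></B></div></td>
    <td><div align="left"><FONT class="tplus_text">0:00:00 hours</FONT></div></td>
</tr>
<tr>
    <td><div align="left"><B><FONT class="tplus_text">Internet Data Volume Balance</FONT></B></div></td>
    <td><div align="left"><FONT class="tplus_text" style="text-transform:none;">146.30 MB</FONT></div></td>
</tr>
</table>
HTML;

echo findValueByLabel($html, 'Internet Data Volume Balance'), "\n";
```

Because the lookup keys on the label rather than on a cell index, it keeps working if the page gains or loses rows above the one you need.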

Loading from a String

If you have the HTML stored in a string, you can load that into your document as well. Just swap the method that loads from a file for the one that loads from a string:

$str = "<table><tr><td>Foo</td></tr>...</table>";
$doc->loadHTML( $str );

We would then proceed with the same code as above to select the cells, and show their textContent in the output.
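Put together, the string-based variant might look like the sketch below. The `$str` sample is a simplified stand-in for the table in the question, and the last step, which collapses "146.30 MB" into the "146.30mb" form the question asks for, is an assumption about the desired output format.

```php
<?php
// Simplified stand-in for the table in the question (six cells in total).
$str = '<table>
<tr><td>Account Status</td><td>You exceeded your allowed credit.</td></tr>
<tr><td>Period Free Time Remaining</td><td>0:00:00 hours</td></tr>
<tr><td>Internet Data Volume Balance</td><td>146.30 MB</td></tr>
</table>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($str);
libxml_clear_errors();

// item() is zero-based, so index 5 is the sixth cell.
$cells = $doc->getElementsByTagName('td');
$balance = trim($cells->item(5)->textContent);

// Assumed target format: drop the space and lowercase the unit.
$compact = strtolower(str_replace(' ', '', $balance));
echo $compact, "\n";
```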

Check out the DOMDocument Class.

If you have already installed phpQuery, or are willing to, you can use that instead.

phpQuery::newDocumentFileHTML('htmlpage.html');
// :eq() is zero-based, as in jQuery, so the sixth cell is index 5
echo pq('td:eq(5)')->text();

