
Regex Repeating Pattern to Capture All HTML Table Column Contents

I am attempting to capture all column contents within HTML tables. I'm very close, but my regex is only capturing the first column of each table. What do I need to do to capture all of the columns?

Here is my regex and HTML: https://regex101.com/r/jA3sS6/1

Don't use a regular expression for this; use an HTML parser instead!

Start with this:

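$dom = new DOMDocument();
// Collect libxml warnings internally instead of printing them for malformed HTML
libxml_use_internal_errors( true );
$dom->loadHTML( $html );   // $html holds the page markup as a string
$xpath = new DOMXPath( $dom );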

To retrieve all <td> elements:

foreach( $dom->getElementsByTagName( 'td' ) as $td )
{
    echo $td->nodeValue . PHP_EOL;
}

To retrieve only the <td class="large-text"> elements:

foreach( $xpath->query( '//td[@class="large-text"]' ) as $td )
{
    echo $td->nodeValue . PHP_EOL;
}
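
If the columns need to stay grouped by table and row (which is what the question seems to be after), the same XPath approach can be nested. The following is only a minimal sketch: the sample markup in $html is hypothetical, since the original HTML is only available through the regex101 link, and the variable names $tables, $rows and $cells are made up for illustration.

```php
<?php
// Hypothetical sample markup standing in for the HTML from the question.
$html = '<table><tr><td class="large-text">A1</td><td>B1</td></tr>'
      . '<tr><td class="large-text">A2</td><td>B2</td></tr></table>'
      . '<table><tr><td>C1</td><td>D1</td></tr></table>';

$dom = new DOMDocument();
libxml_use_internal_errors( true );
$dom->loadHTML( $html );
$xpath = new DOMXPath( $dom );

$tables = [];
foreach( $xpath->query( '//table' ) as $table )
{
    $rows = [];
    // The second argument makes the query relative to the current table
    foreach( $xpath->query( './/tr', $table ) as $tr )
    {
        $cells = [];
        // Only the <td> children of this row, in document order
        foreach( $xpath->query( './td', $tr ) as $td )
        {
            $cells[] = trim( $td->nodeValue );
        }
        $rows[] = $cells;
    }
    $tables[] = $rows;
}

print_r( $tables );
```

Each entry of $tables is then one table, holding one array of cell texts per row. Note that './/tr' would also pick up rows of nested tables, so this sketch assumes the tables are not nested.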


 