Regex repeating pattern to capture all HTML table column contents

Question

I am trying to capture the contents of every column in a set of HTML tables. I'm very close, but my regex captures only the first column of each table. What do I need to change to capture all of the columns?

Here is my regex and HTML: https://regex101.com/r/jA3sS6/1

Answer 1

Don't use a regular expression; use a parser instead!

Start with this:

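$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them for imperfect HTML.
libxml_use_internal_errors( true );
$dom->loadHTML( $html );
// XPath object used for the attribute-based query further down.
$xpath = new DOMXPath( $dom );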

To retrieve all <td> elements:

foreach( $dom->getElementsByTagName( 'td' ) as $td )
{
    echo $td->nodeValue . PHP_EOL;
}
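
Since the question asks for every column of each table, it may help to keep the cells grouped by row rather than printing one flat list. The following is only a sketch under assumptions: the sample markup in $html is made up for illustration (the asker's real HTML is available only through the regex101 link), and $rows and $cells are hypothetical variable names.

```php
<?php
// Hypothetical sample markup, standing in for the asker's real HTML.
$html = '<table>'
      . '<tr><td class="large-text">A1</td><td>B1</td></tr>'
      . '<tr><td class="large-text">A2</td><td>B2</td></tr>'
      . '</table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$rows = [];
// Visit every row of every table in the document.
foreach ($xpath->query('//table//tr') as $tr) {
    $cells = [];
    // The leading "./" makes this query relative to the current row.
    foreach ($xpath->query('./td', $tr) as $td) {
        $cells[] = trim($td->nodeValue);
    }
    $rows[] = $cells;
}

print_r($rows);
```

Each entry of $rows then holds one row's cells in column order, so a given column can be read by index from each row.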

To retrieve all <td class="large-text"> elements:

foreach( $xpath->query( '//td[@class="large-text"]' ) as $td )
{
    echo $td->nodeValue . PHP_EOL;
}
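
Note that @class="large-text" compares the whole attribute value, so a cell written as class="large-text bold" would not match. If the real markup carries several classes per cell (an assumption, since the original HTML is not reproduced here), a common XPath 1.0 idiom is to match the class as a whitespace-delimited token. A minimal sketch with made-up markup and a hypothetical $matches variable:

```php
<?php
// Hypothetical markup: the first cell carries two classes.
$html = '<table><tr>'
      . '<td class="large-text bold">A</td>'
      . '<td class="small-text">B</td>'
      . '</tr></table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Pad the normalized class list with spaces so "large-text" matches only as a whole token.
$query = '//td[contains(concat(" ", normalize-space(@class), " "), " large-text ")]';

$matches = [];
foreach ($xpath->query($query) as $td) {
    $matches[] = $td->nodeValue;
}

print_r($matches);
```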

Read more about DOMDocument
Read more about DOMXPath
Read why you can't parse [X]HTML with regular expressions

Answer 1: accepted, 2016-03-30 19:42:08
