How to use preg_match_all to get the content of an HTML tag

Question

I have some HTML code that contains this:

<table class="qprintable2" width="100%" cellpadding="4" cellspacing="0" border="0">
content goes here !
</table>

I have this function to match the content inside the tag:

function getTextBetweenTags($string, $tagname)
{
  $pattern = "/<table class=\"class1\" width=\"100%\" cellpadding=\"4\" cellspacing=\"0\" border=\"0\">(.*?)<\/$tagname>/"; 
  preg_match_all($pattern, $string, $matches);
  return $matches[1];
}

But it doesn't match anything, so I would highly appreciate it if you could give me a good pattern for this :(

Answer 1

You should avoid this, but you can use a regex like:

preg_match('#<table[^>]+>(.+?)</table>#ims', $str, $matches);  // $matches[1] holds the table content

The various tricks here are:

the /ims modifiers, so that "." also matches newlines (s), matching is case-insensitive (i), and ^ and $ match at line boundaries (m)
using # instead of / to enclose the regex, so you don't have to escape HTML closing tags
using [^>]+ to make it unspecific and avoid listing individual HTML attributes (more reliable)
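
Applied to the helper from the question, these tricks could look roughly like the sketch below. It keeps the asker's function name and signature, builds the pattern from $tagname, and uses [^>]* rather than [^>]+ so that a bare <table> with no attributes also matches. The m modifier is left out because the pattern has no ^ or $ anchors. The sample markup in $html is taken from the question; everything else is an assumption about what the asker wanted.

```php
<?php
// Sketch only: a rewrite of the asker's helper, not a general HTML parser.
function getTextBetweenTags(string $string, string $tagname): array
{
    // Escape the tag name for use inside a pattern delimited by '#'.
    $tag = preg_quote($tagname, '#');
    // [^>]* skips any attributes, s lets "." span newlines, i ignores case.
    $pattern = '#<' . $tag . '\b[^>]*>(.*?)</' . $tag . '>#is';
    preg_match_all($pattern, $string, $matches);
    // Group 1 of every match; an empty array when nothing matched.
    return $matches[1];
}

$html = '<table class="qprintable2" width="100%" cellpadding="4" cellspacing="0" border="0">
content goes here !
</table>';

$contents = getTextBetweenTags($html, 'table');
echo trim($contents[0]), PHP_EOL;  // prints: content goes here !
```

Because the match is non-greedy, nested tables will be cut off at the first closing tag, which is one of the reasons a real parser is usually preferred.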

While this is a case where regexes would work okayish, the general consensus is that you should use QueryPath or phpQuery (or alike) to extract HTML. It's also much simpler:

qp($html)->find("table")->text();  //would return just the text content
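
If installing QueryPath or phpQuery is not an option, PHP's bundled DOM extension can do a similar job. The sketch below swaps in DOMDocument for the libraries named above, so it is an alternative rather than what the answer itself proposes. The helper name getTableTexts is hypothetical, and the sample markup wraps the text in a row and cell, since how a parser treats bare text directly inside <table> can vary.

```php
<?php
// Sketch only: collects the text content of every <table> using the DOM extension.
function getTableTexts(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is often invalid; collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($doc->getElementsByTagName('table') as $table) {
        // textContent concatenates all text nodes below the element.
        $texts[] = trim($table->textContent);
    }
    return $texts;
}

$html = '<table class="qprintable2" width="100%"><tr><td>content goes here !</td></tr></table>';
print_r(getTableTexts($html));
```

Note that getElementsByTagName also returns nested tables, so an inner table's text would appear both on its own and as part of its parent's text.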

Solution 1
3 2010-12-18 02:38:05
