PCRE regular expression

Question

I have a piece of HTML like the following:

<td width="24%"><b>Something</b></td>
          <td width="1%"></td>
          <td width="46%" align="center">
           <p><b>
    needed
  value</b></p>
          </td>
          <td width="28%" align="center">
            &nbsp;</td>
        </tr>

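What would be a good regular expression to extract the first text node after the word Something (not the tags, but the text inside them)? That is, I want to extract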

     needed
  value

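and nothing else.

I can't work out a regex pattern that does this in PHP.

Edit: I'm not parsing a whole HTML document, only a few lines, so I want to use a regular expression rather than an HTML parser.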

Answer 1

Setting aside the potential problems of parsing HTML with regular expressions, the following pattern should match your sample code:

Something(?:(?:<[^>]+>)|\s)*([\w\s*]+)

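This matches Something, then any sequence of HTML tags (or whitespace), and then captures the next block of text made up of \w characters (including whitespace).

You can use it with PHP's preg_match() function like this: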

if (preg_match('/Something(?:(?:<[^>]+>)|\s)*([\w\s*]+)/', $inputString, $match)) {
    $matchedValue = $match[1];
    // do whatever you need
}
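
For a self-contained check, the snippet above can be wrapped in a small helper and run against the sample markup from the question. This is only a minimal sketch: the function name `extractTextAfter` and the whitespace-collapsing step are assumptions added for illustration and are not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): returns the first
// text block that follows $label, skipping any tags and whitespace.
function extractTextAfter(string $html, string $label): ?string
{
    // preg_quote() escapes regex metacharacters in the label,
    // including the '/' delimiter passed as its second argument.
    $pattern = '/' . preg_quote($label, '/') . '(?:(?:<[^>]+>)|\s)*([\w\s*]+)/';

    if (preg_match($pattern, $html, $match) !== 1) {
        return null; // no match, or a regex error
    }

    // Assumption: the caller wants runs of whitespace collapsed to one space.
    return preg_replace('/\s+/', ' ', trim($match[1]));
}

// Sample markup taken from the question.
$html = <<<'HTML'
<td width="24%"><b>Something</b></td>
          <td width="1%"></td>
          <td width="46%" align="center">
           <p><b>
    needed
  value</b></p>
          </td>
HTML;

echo extractTextAfter($html, 'Something'), PHP_EOL; // prints "needed value"
```

Without the whitespace-collapsing step, the captured group keeps the line break and indentation between "needed" and "value" exactly as they appear in the markup.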

Explanation of the regular expression:

Something         # has to start with 'Something'
(?:               # non-capturing group
    (?:           # non-capturing group
        <[^>]+>   # any HTML tag, <...>
    )
    | \s          # OR whitespace
)*                # this group can match 0+ times
(                 # capturing group 1
    [\w\s*]+      # word characters, whitespace, or a literal '*'
)
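
The commented layout above can also be passed to PHP more or less as written by enabling PCRE's `x` (extended) modifier, which ignores unescaped whitespace in the pattern and treats `#` through the end of the line as a comment. The following is a sketch under some assumptions: it uses `~` as the delimiter so that `/` never needs escaping, a nowdoc so that quotes in comments are harmless, and illustrative variable names and input.

```php
<?php
// Same pattern as above, written in extended (x) mode.
// Note: the delimiter character '~' must not appear inside the comments.
$pattern = <<<'REGEX'
~
Something          # literal text that anchors the match
(?:                # non-capturing group
    <[^>]+>        # an HTML tag
    | \s           # or a whitespace character
)*                 # repeated zero or more times
(                  # capturing group 1
    [\w\s*]+       # word characters, whitespace, or a literal asterisk
)
~x
REGEX;

// Shortened, illustrative input modelled on the question's markup.
$input = "<b>Something</b></td>\n<td><p><b>\n    needed\n  value</b></p>";

if (preg_match($pattern, $input, $match)) {
    // $match[1] holds the captured text, line break included.
    var_dump($match[1]);
}
```

Keeping the comments inside the pattern means the explanation and the code cannot drift apart, at the cost of a slightly longer string.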

Solution 1
4, accepted, 2012-10-04 17:33:32
