preg_match_all: scraping words found between HTML tags

Question

I have the following piece of code, which should match the provided string against $contents. The $contents variable holds the contents of a web page, fetched with file_get_contents():

if (preg_match('~<p style="margin-top: 40px; " class="head">GENE:<b>(.*?)</b>~iU', $contents, $match)) {
    $found_match = $match[1];
}

The original string on that web page looks like this:

<p style="margin-top: 40px; " class="head">GENE:<b>TSPAN6</b>

I would like to match the string 'TSPAN6' found on the web page through (.*?) and store it in $match[1]. However, the matching does not seem to work. Any ideas?

Answer 1

Unfortunately, your suggestion did not work.

After some hours of looking through the HTML code, I realized that the page simply had a blank space right after the colon, which the regex did not allow for. As such, the code snippet now looks like this:

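// Note the space between "GENE:" and "<b>", matching the page's markup.
$pattern1 = '#GENE: <b>(.*)</b>#i';
preg_match($pattern1, $contents, $match1);
// $match1[1] is only set when the pattern matched.
if (isset($match1[1]))
{
    $found_flag = $match1[1];
}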
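
To avoid depending on exactly one space after the colon, the pattern can accept any amount of whitespace there. The following is only a minimal sketch: the helper name extract_gene and the sample markup are assumptions made for illustration, not part of the original code.

```php
<?php
// Hypothetical helper: returns the text inside <b>...</b> that follows
// "GENE:", or null when nothing matches.
function extract_gene(string $contents): ?string
{
    // \s* accepts zero or more whitespace characters after the colon.
    // (.*?) is lazy, so it stops at the first closing </b>.
    $pattern = '#GENE:\s*<b>(.*?)</b>#i';

    if (preg_match($pattern, $contents, $match)) {
        return $match[1];
    }
    return null;
}

// Sample markup standing in for the fetched page (assumed, with a space
// after the colon as described above).
$with_space    = '<p style="margin-top: 40px; " class="head">GENE: <b>TSPAN6</b>';
$without_space = '<p style="margin-top: 40px; " class="head">GENE:<b>TSPAN6</b>';

$found_with    = extract_gene($with_space);
$found_without = extract_gene($without_space);

echo $found_with, "\n";     // TSPAN6
echo $found_without, "\n";  // TSPAN6
```

One detail worth knowing about the pattern in the question: the U modifier inverts greediness, so combining it with (.*?) makes that group greedy rather than lazy. The sketch above leaves U out for that reason.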

Answer 2

Try this:

preg_match('#GENE:<b>([^<]+)</b>#si', $contents, $match);
$found_match = isset($match[1]) ? $match[1] : false;
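
Since the title mentions preg_match_all, here is a rough sketch of how the same [^<]+ idea might collect every such entry on a page instead of only the first. The sample markup, including the second gene name, is made up for illustration, and the \s* is an assumption added to tolerate optional whitespace after the colon.

```php
<?php
// Hypothetical sample markup with two entries (not taken from the real page).
$contents = '<p class="head">GENE: <b>TSPAN6</b></p>'
          . '<p class="head">GENE:<b>TNMD</b></p>';

// [^<]+ captures everything up to the next tag.
// preg_match_all returns the number of full matches (or false on error).
$count = preg_match_all('#GENE:\s*<b>([^<]+)</b>#i', $contents, $matches);

// With the default PREG_PATTERN_ORDER, $matches[1] holds the first capture
// group of every match, in document order.
$genes = $count ? $matches[1] : [];

print_r($genes);
```

Falling back to an empty array keeps later code simple, since it can loop over $genes without first checking whether anything matched.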

Answer 1
1, accepted, 2012-09-20 07:16:31

Answer 2
0, 2012-08-10 23:24:36