
preg_match_all to scrape found word between html tags

I have the following piece of code, which should match a pattern against $contents. The $contents variable holds the contents of a web page, fetched with file_get_contents():

if (preg_match('~<p style="margin-top: 40px; " class="head">GENE:<b>(.*?)</b>~iU', $contents, $match)) {
    $found_match = $match[1];
}

The original string on the said webpage looks like this:

<p style="margin-top: 40px; " class="head">GENE:<b>TSPAN6</b>

I would like to capture the string 'TSPAN6' from the web page with (.*?) and store it in $match[1]. However, the pattern does not match. Any ideas?

Unfortunately, your suggestion did not work.

After some hours of looking through the HTML, I realized that the page simply has a blank space right after the colon (GENE: <b>), which my regex did not allow for. The code snippet now looks like this:

$pattern1 = '#GENE: <b>(.*)</b>#i';
preg_match($pattern1, $contents, $match1);
if (isset($match1[1]))
{
    $found_flag = $match1[1];
}
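
Since the whole problem came down to one space, a pattern that tolerates optional whitespace may be more robust than hard-coding it. The following is only a minimal sketch: the $contents string is a hypothetical sample standing in for the fetched page, and \s* plus [^<]+ are my own choices rather than anything the page requires.

```php
<?php
// Hypothetical sample standing in for the page fetched with file_get_contents().
$contents = '<p style="margin-top: 40px; " class="head">GENE: <b>TSPAN6</b>';

// \s* accepts zero or more whitespace characters after the colon;
// [^<]+ stops at the first "<", so the capture cannot run past </b>.
$pattern = '~GENE:\s*<b>([^<]+)</b>~i';

// preg_match() returns 1 on a match, 0 on no match, false on error.
$found_match = preg_match($pattern, $contents, $match) === 1 ? trim($match[1]) : null;

var_dump($found_match);
```

This way the same pattern works whether or not the page puts a space between the colon and the <b> tag.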

Try this:

preg_match( '#GENE:<b>([^<]+)</b>#si', $contents, $match );
$found_match = ( isset($match[1]) ? $match[1] : false );
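
The title mentions preg_match_all, so here is a rough sketch of how every GENE entry could be collected if the page lists more than one. The sample string and the second gene name are made up for illustration, and the optional \s* after the colon is an assumption about how the markup may vary.

```php
<?php
// Hypothetical sample; the second gene name is invented for illustration.
$contents = '<p class="head">GENE: <b>TSPAN6</b></p>' . "\n"
          . '<p class="head">GENE:<b>EXAMPLE2</b></p>';

// preg_match_all() returns the number of full matches (0 if none, false on error).
// With the default PREG_PATTERN_ORDER flag, $matches[1] holds every capture of group 1.
$count = preg_match_all('~GENE:\s*<b>([^<]+)</b>~i', $contents, $matches);

// Fall back to an empty array when nothing matched or an error occurred.
$genes = $count ? $matches[1] : [];

print_r($genes);
```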


 