
PHP preg_match multiple tags in html

I need to split my HTML into sections by tag: each section should run from the start of an <h1> tag up to, but not including, the following <h1> tag. The expression I have so far works, but it only retrieves the first section. I'm looking specifically for a preg_match solution, ideally a dynamic one that works no matter what is between the <h1> tags or how many sections there are. I'm new to PHP and to regex in general, so this may be a tricky question.

HTML:

<h1>heading1</h1>
<img>
<p>para1</p>
<p>para2</p>
<h1>heading2</h1>
<img>
<p>para1</p>
<p>para2</p>
<h1>heading3</h1>
<img>
<p>para1</p>
<p>para2</p>

Desired output, or something similar:

array{
  [0]=> <h1>heading1</h1>
        <img>
        <p>para1</p>
        <p>para2</p>

  [1]=> <h1>heading2</h1>
        <img>
        <p>para1</p>
        <p>para2</p>

  [2]=> <h1>heading3</h1>
        <img>
        <p>para1</p>
        <p>para2</p>
}

My current expression is:

$regex = '#<\/p>\s*(<h1>.*?)<h1>#s';
$preg = preg_match_all($regex, $content, $matches);
print_r($matches);

Very thankful for any help!

preg_match_all() won't return overlapping matches. Your regexp ends with <h1>, which is also the start of the next match. Because that tag has already been consumed by the current match, the next match is not returned.

Put the start of the next match into a lookahead so it's not included in the match.

$regex = '#<\/p>\s*(<h1>.*?)(?=<h1>)#s';
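
As a minimal sketch, here is the lookahead idea run against the sample HTML from the question. In that sample the first section has no preceding </p>, the last section has no following <h1>, and the `<\/p>\s*` prefix itself consumes the closing tag that the following section would need. The variant below therefore assumes every section begins at an <h1>, drops the prefix, and adds `\z` (end of input) as an alternative inside the lookahead. The variable name $sections is only illustrative.

```php
<?php
// Sample HTML from the question (nowdoc, so nothing is interpolated).
$content = <<<'HTML'
<h1>heading1</h1>
<img>
<p>para1</p>
<p>para2</p>
<h1>heading2</h1>
<img>
<p>para1</p>
<p>para2</p>
<h1>heading3</h1>
<img>
<p>para1</p>
<p>para2</p>
HTML;

// Assumption: every section starts at an <h1> tag, so no "</p>" prefix is needed.
// The lookahead stops each match just before the next <h1> without consuming it;
// the \z alternative lets the last section run to the end of the input.
$regex = '#<h1>.*?(?=<h1>|\z)#s';

preg_match_all($regex, $content, $matches);

// $matches[0] holds the full matches; trim() drops the trailing newline of each section.
$sections = array_map('trim', $matches[0]);

print_r($sections);
```

On this sample, $sections should hold three entries, one per <h1> block. If the real content has markup before the first <h1>, that leading part is simply not matched.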
