
Regex to find text outside HTML tags and then replace <br> tag with <p> using PHP

I'm using an HTML5 WYSIWYG editor which doesn't seem to insert paragraph tags automatically. It does, however, insert br, h1, h2, etc. A basic example of the generated text looks like this:

<h1>This is a header</h1>This should be a stand alone paragraph<br><br>This paragraph should split<br>into two lines.

I would like to know if it's possible to take everything not within an open and closing tag and put that in a paragraph, and then replace a double br with a </p> <p> to generate this:

<h1>This is a header</h1><p>This should be a stand alone paragraph</p><p>This paragraph should split<br>into two lines.</p>

Thanks in advance for the help.

As long as your input text has the same structure as the example you provided:

<h1>This is a header</h1>This should be a stand alone paragraph<br><br>This paragraph should split<br>into two lines.

then you can use this regex:

^(.*<\/h1>)([^<]+)(<br>\s*<br>)(.*)$

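And use this as the replacement: \\1<p>\\2</p><p>\\4</p>

Note that this pattern only handles a single <br><br> following the </h1>; input with more paragraph breaks needs a more general approach.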

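For reference, here is a minimal sketch of how that pattern and replacement could be applied in PHP. The variable names are illustrative, and the "/" delimiters are added because preg_replace() requires them; the sample input is the one from the question.

```php
<?php
// Sample input copied from the question.
$html = '<h1>This is a header</h1>This should be a stand alone paragraph'
      . '<br><br>This paragraph should split<br>into two lines.';

// The pattern from the answer, wrapped in "/" delimiters for preg_replace().
$pattern = '/^(.*<\/h1>)([^<]+)(<br>\s*<br>)(.*)$/';

// In a single-quoted PHP string, \\1 reaches the regex engine as \1.
// Group 3 (the double <br>) is dropped on purpose.
$replacement = '\\1<p>\\2</p><p>\\4</p>';

$result = preg_replace($pattern, $replacement, $html);

echo $result, PHP_EOL;
```

If the input does not match this exact shape, preg_replace() returns the string unchanged, so it is worth checking the result before relying on it.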

It should be easily possible. Steps I would take:

  • Take the HTML generated
  • Use a regular expression to find text that is not within tags
  • Wrap the text in <p></p>
  • Replace each <br><br> inside the wrapped text with </p><p>
  • Replace back into the generated HTML with the new string in place of the old one
  • Repeat until there are no further chunks of text not within tags
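The steps above could be sketched in PHP roughly as follows. This is one possible reading, not a definitive implementation: wrapLooseText() is a hypothetical helper name, and the code assumes that headings (h1 to h6) are the only block-level elements the editor emits and that <br> is the only tag appearing inside loose text. Instead of looping, it splits the markup once with preg_split().

```php
<?php
/**
 * Wrap text found outside heading elements in <p> tags and turn runs of
 * two or more <br> into paragraph boundaries.
 * Hypothetical helper; assumes h1-h6 are the only block-level elements.
 */
function wrapLooseText(string $html): string
{
    // Split on complete heading elements, keeping them in the result.
    $chunks = preg_split(
        '/(<h[1-6]\b[^>]*>.*?<\/h[1-6]>)/is',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $out = '';
    foreach ($chunks as $chunk) {
        // Headings pass through untouched.
        if (preg_match('/^<h[1-6]\b/i', $chunk)) {
            $out .= $chunk;
            continue;
        }

        // Two or more consecutive <br> tags mark a paragraph break;
        // a single <br> stays inside its paragraph.
        $paragraphs = preg_split(
            '/(?:<br\s*\/?>\s*){2,}/i',
            $chunk,
            -1,
            PREG_SPLIT_NO_EMPTY
        );

        foreach ($paragraphs as $paragraph) {
            $paragraph = trim($paragraph);
            if ($paragraph !== '') {
                $out .= '<p>' . $paragraph . '</p>';
            }
        }
    }

    return $out;
}

$html = '<h1>This is a header</h1>This should be a stand alone paragraph'
      . '<br><br>This paragraph should split<br>into two lines.';

echo wrapLooseText($html), PHP_EOL;
```

Splitting once avoids the "repeat until done" loop, at the cost of hard-coding which tags count as block-level. Markup containing other block elements (lists, tables, divs) would need those added to the first pattern.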

Also, instead of regular expressions, an HTML parser may be a better way to determine what's inside tags and what isn't. If the processing can happen client-side in JavaScript, see: http://api.jquery.com/jQuery.parseHTML/
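Since the question asks about PHP, a comparable parser-based check can be sketched with PHP's bundled DOM extension. This swaps jQuery for DOMDocument, assumes the DOM extension is enabled, and uses a hypothetical helper name, classifyTopLevelNodes(). It only reports which top-level pieces are loose text and which are elements; it does not do the wrapping.

```php
<?php
/**
 * List the top-level nodes of an HTML fragment as [kind, value] pairs,
 * where kind is 'text' for loose text and 'element' for a tag.
 * Hypothetical helper; requires PHP's DOM extension.
 */
function classifyTopLevelNodes(string $html): array
{
    $previous = libxml_use_internal_errors(true); // silence parser warnings

    $doc = new DOMDocument();
    // Wrap the fragment in a <div> so its pieces share one known parent.
    $doc->loadHTML('<div>' . $html . '</div>');

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root = $doc->getElementsByTagName('div')->item(0);

    $nodes = [];
    foreach ($root->childNodes as $node) {
        if ($node instanceof DOMText) {
            $nodes[] = ['text', $node->nodeValue];   // text outside any tag
        } elseif ($node instanceof DOMElement) {
            $nodes[] = ['element', $node->tagName];  // h1, br, ...
        }
    }

    return $nodes;
}

$html = '<h1>This is a header</h1>This should be a stand alone paragraph'
      . '<br><br>This paragraph should split<br>into two lines.';

foreach (classifyTopLevelNodes($html) as [$kind, $value]) {
    echo $kind, ': ', $value, PHP_EOL;
}
```

With the node list in hand, consecutive text and single br nodes can be grouped into paragraphs, and two br elements in a row treated as a paragraph boundary.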
