Finding two html tags with Regular Expressions

Question

I need to pull the content out of two paragraph tags and separate it with a <br /> tag. The input looks like this:

<p>
Yay
</p>
<p>
StackOverFlow
</p>

The output needs to look like this:

<p>
Yay <br />
StackOverflow
</p>

What I have so far is:

<?php preg_match('/(.*)<\\/p>/', $content, $match); echo($match[1])."..."; ?>

which pulls the first paragraph tag only:

<p>
Yay...
</p>

Also, is it possible to set a character limit? For example, a maximum of 40 characters across both paragraphs, or would I have to use substr?

Thanks!

So it turned out to be:

<?php $content = preg_replace('/<\/p>\s*<p>/', '<br/>', $content);  echo substr("$content",0,180)."..."; ?>

Answer 1

Do yourself a favor and use an HTML parser (DOMDocument::loadHTML, for example). It's easier and less fragile.
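As a rough illustration of the parser route, here is a minimal sketch using DOMDocument on the sample input from the question. The variable names ($content, $parts, $merged) are just for illustration, and it assumes the dom extension is enabled and the input is plain ASCII (loadHTML needs extra care with other encodings).

```php
<?php
// Sample input from the question; variable names are illustrative.
$content = "<p>\nYay\n</p>\n<p>\nStackOverFlow\n</p>";

$doc = new DOMDocument();
// Collect libxml warnings internally instead of emitting them for imperfect markup.
libxml_use_internal_errors(true);
$doc->loadHTML($content);
libxml_clear_errors();

// Gather the trimmed text of every <p> element.
$parts = [];
foreach ($doc->getElementsByTagName('p') as $p) {
    $parts[] = trim($p->textContent);
}

// Rebuild one paragraph: escape each piece of text, then join with <br />.
$merged = '<p>' . implode('<br />', array_map('htmlspecialchars', $parts)) . '</p>';

echo $merged;
```

Because the text is extracted before the markup is rebuilt, any length limit can be applied to the plain text in $parts rather than to the HTML string.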

Answer 2

I think you're making it more complicated than it needs to be. Given that you want to collapse:

<p>Yay</p><p>StackOverFlow</p>

into:

<p>Yay<br />StackOverflow</p>

Then just replace instances of </p><p> with <br />: preg_replace('/<\\/p>\\s*<p>/', '<br />', $input).
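A minimal sketch of that substitution, run on the sample input from the question. It assumes the pattern is meant to match a closing </p>, optional whitespace, and the next opening <p>; $input and $output are illustrative names.

```php
<?php
// Sample input from the question, with the paragraphs on separate lines.
$input = "<p>\nYay\n</p>\n<p>\nStackOverFlow\n</p>";

// Replace each "</p>", optional whitespace, "<p>" sequence with a line break tag.
$output = preg_replace('/<\/p>\s*<p>/', '<br />', $input);

echo $output;
```

Note that this only handles bare <p> tags. A paragraph with attributes (such as a class) would not match, which is the kind of fragility the parser-based answers warn about.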

In general, however, note that use of regular expressions for this kind of complex parsing is fraught with peril. More succinctly:

"Some people, when faced with a problem, think, 'I know, I'll use regular expressions.' Now they have two problems." -- Jamie Zawinski

Answer 3

My advice: regex can only go so far. See one of my posts here: Extracting text fragment from a HTML body (in .NET)

It has string truncation regex too.
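The linked post is not reproduced here, so the following is not its code. It is only a sketch of one way to truncate plain text with a regex in PHP, for the character limit asked about in the question. truncate_text is a hypothetical helper name, and it counts bytes (strlen/substr), so multibyte text would need the mb_ equivalents.

```php
<?php
// Hypothetical helper: cut $text to at most $limit bytes, ending on a word boundary.
function truncate_text($text, $limit)
{
    if (strlen($text) <= $limit) {
        return $text; // Short enough already, return unchanged.
    }
    // Take the first $limit bytes, then drop the last (possibly partial) word.
    $cut = substr($text, 0, $limit);
    $cut = preg_replace('/\s+\S*$/', '', $cut);
    return $cut . '...';
}

echo truncate_text('Yay StackOverFlow is a question and answer site for programmers', 40);
// prints "Yay StackOverFlow is a question and..."
```

This is intended for text that has already been extracted from the markup. Applying it to the raw HTML string could still cut through a tag.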

3 answers

solution1
6 2009-10-27 12:39:02

solution2
4 ACCEPTED 2009-10-27 12:40:10

solution3
0 2009-10-27 12:39:01
