Join Broken Paragraphs HTML Regex

Question

I'm trying to edit some XHTML in Sigil.

With the command

<p>([a-z])

I'm able to find all paragraphs that begin with a lowercase letter. That tells me they shouldn't be separate from the previous paragraph. It's just a conversion issue.

What should I do to delete both the <p> from that paragraph and the </p> from the previous one, in order to join the two blocks of text into a single paragraph?

It looks something like this:

<p>... that is why relationships</p>

<p> are not what they should be.</p>

And it should be:

<p>... that is why relationships are not what they should be.</p>

Answer 1

I'm not too sure about Sigil, but the following regex should be able to do that:

First find:

</p>\s*<p>(\s*[a-z])

Then replace it with:

$1

What this means:

\s* : Any amount of whitespace (including none)

$1 : The group () you'll keep after replacing
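
For readers who want to try this find-and-replace outside Sigil, here is a minimal Python sketch. The pattern is the one given above; the helper name `join_broken_paragraphs` and the sample string are made up for illustration, and Python spells the kept group as `\1` rather than `$1`.

```python
import re

# Pattern from the answer: a closing </p>, optional whitespace, an opening <p>,
# then a captured group holding optional whitespace plus one lowercase letter.
JOIN_BREAK = re.compile(r"</p>\s*<p>(\s*[a-z])")


def join_broken_paragraphs(xhtml: str) -> str:
    """Hypothetical helper: drop the </p>...<p> pair and keep only group 1."""
    return JOIN_BREAK.sub(r"\1", xhtml)


sample = "<p>... that is why relationships</p>\n\n<p> are not what they should be.</p>"
joined = join_broken_paragraphs(sample)
print(joined)
```

Note that the replacement keeps only what the group captured. If the second paragraph does not begin with a space, the two words are glued together, so a replacement with a leading space may be needed in that case.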

Answer 2

Or an easier way, with Dot Matches All checked:

<p>(.+?)</p>

Then replace with only the group: $1 or \1.

Only the block of text will remain.

(.+?) - One or more characters, as few as possible: the match stops at the first </p> that follows.

(.*?) - The same, but it also matches zero characters, so empty paragraphs match too. (Careful!)
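
The difference between these quantifiers is easier to see on a small sample. The sketch below uses Python's `re` module, with `re.DOTALL` standing in for Sigil's Dot Matches All option (an assumption about equivalent behaviour, not something tested in Sigil); the sample text and variable names are invented.

```python
import re

text = "<p>first\nblock</p>\n<p>second</p>"

# Lazy: each match stops at the nearest closing </p>.
lazy = re.findall(r"<p>(.+?)</p>", text, flags=re.DOTALL)

# Greedy: one match runs on to the last </p> in the text.
greedy = re.findall(r"<p>(.+)</p>", text, flags=re.DOTALL)

# Without DOTALL the dot cannot cross the newline inside the first paragraph.
no_dotall = re.findall(r"<p>(.+?)</p>", text)

# Replacing with the group alone strips the tags and keeps the text.
unwrapped = re.sub(r"<p>(.+?)</p>", r"\1", text, flags=re.DOTALL)

# .*? accepts an empty paragraph, .+? does not.
empty_star = re.findall(r"<p>(.*?)</p>", "<p></p>", flags=re.DOTALL)
empty_plus = re.findall(r"<p>(.+?)</p>", "<p></p>", flags=re.DOTALL)

print(lazy, greedy, no_dotall, unwrapped, empty_star, empty_plus, sep="\n")
```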

Build your regex:

if you have newlines, use \n
if you have whitespace, use \s
if you want to exclude characters, use ^ at the start of a character class, e.g. [^a-z]
if you want a newline followed by whitespace, use (\n\s)
if you want zero or more of something, put * after it. Ex: \s* (any amount of whitespace, including none)
if you want to search by first letter, use ([a-z]), or for a run of letters ([a-z]+)
by a digit ([0-9]), or a run of digits ([0-9]+)
exactly two letters ([a-z]{2}), etc.
Advice:
Always use the preview, or replace only the first match, to see the difference.
Put the parts you want to keep into groups with brackets ().
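
The building blocks listed above can be tried one at a time before combining them. This is a small Python sketch on an invented sample string; the variable names are illustrative only, and Sigil's own regex engine may differ in small details.

```python
import re

sample = "ab 12\ncd 345"

newlines = re.findall(r"\n", sample)               # literal newlines
whitespace_runs = re.findall(r"\s+", sample)       # spaces and newlines
not_letters = re.findall(r"[^a-z\s]+", sample)     # ^ inside [] excludes
letter_runs = re.findall(r"[a-z]+", sample)        # runs of lowercase letters
digit_runs = re.findall(r"[0-9]+", sample)         # runs of digits
first_letter = re.match(r"([a-z])", sample).group(1)       # one leading letter
first_two = re.match(r"([a-z]{2})", sample).group(1)       # exactly two letters

print(newlines, whitespace_runs, not_letters, letter_runs,
      digit_runs, first_letter, first_two, sep="\n")
```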

Hope this helps you understand your issue better.

solution1
0 2015-09-23 13:52:21

solution2
0 2017-02-07 11:12:28
