
Replacing regular line breaks and Unicode line breaks

My text is formatted as paragraphs, with a date always on the line above each article. The problem is that each article is followed by an unknown number of line breaks, some of which are different kinds of Unicode line breaks. I need to remove every run of line breaks between articles and replace it with exactly two newlines (\n\n).

So from this

05/12
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It 
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.




11/01
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It 
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.

To this

05/12
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It 
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.

11/01
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It 
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.

I tried using preg_replace(), but it's not matching every instance:

$text = preg_replace('/\r?\n+(?=\d{2}\/\d{2})/', "\n\n", $text);

I posted on a similar question about this a month or so back.

To match anything considered a linebreak sequence, you can use \R.

\R matches a generic newline; that is, anything considered a linebreak sequence by Unicode. This includes all characters matched by \v (vertical whitespace) and the multi-character sequence \x0D\x0A.
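
To make the difference concrete, here is a small sketch that checks which single break sequences each pattern accepts. The sample list and the variable names are my own illustration, not part of your data, and the "\u{...}" string escapes assume PHP 7 or later.

```php
<?php
// Illustrative set of line-break sequences (an assumption, not taken from the question's data).
$breaks = [
    'LF'   => "\n",
    'CR'   => "\r",
    'CRLF' => "\r\n",
    'VT'   => "\x0B",
    'FF'   => "\f",
    'NEL'  => "\u{0085}",
    'LS'   => "\u{2028}",
    'PS'   => "\u{2029}",
];

$results = [];
foreach ($breaks as $name => $seq) {
    $results[$name] = [
        // Generic newline, with the u modifier so the subject is read as UTF-8.
        'R'   => preg_match('~\A\R\z~u', $seq) === 1,
        // The break part of the pattern from the question.
        'old' => preg_match('~\A\r?\n+\z~', $seq) === 1,
    ];
    printf("%-5s \\R: %-3s  \\r?\\n+: %s\n",
        $name,
        $results[$name]['R'] ? 'yes' : 'no',
        $results[$name]['old'] ? 'yes' : 'no');
}
```

Only LF and CRLF satisfy \r?\n+ here, while \R accepts every entry. Note also that \r?\n+ cannot span a run such as \r\n\r\n, because the optional \r is only allowed once at the start, so with Windows line endings it replaces just the last break before the date.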

Try this instead.

$text = preg_replace('~\R+(?=\d{2}/\d{2})~u', "\n\n", $text);
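
As a rough sketch of how that line could be wrapped up and tried on mixed input: the helper name collapse_breaks_before_dates and the sample string are hypothetical, and the sample assumes PHP 7+ for the "\u{2028}" escape.

```php
<?php
// Hypothetical helper name; it only wraps the preg_replace() call shown above.
function collapse_breaks_before_dates(string $text): string
{
    // \R+ consumes a whole run of line breaks of any kind; the lookahead
    // restricts the match to runs that are directly followed by a dd/dd date.
    $result = preg_replace('~\R+(?=\d{2}/\d{2})~u', "\n\n", $text);

    // With the u modifier, preg_replace() returns null when $text is not valid UTF-8.
    if ($result === null) {
        throw new RuntimeException('preg_replace failed, error code ' . preg_last_error());
    }

    return $result;
}

// Made-up sample mixing CRLF, a Unicode line separator and a bare LF between articles.
$input  = "05/12\r\nFirst article.\r\n\r\n\u{2028}\n11/01\r\nSecond article.";
$output = collapse_breaks_before_dates($input);

// json_encode() makes the otherwise invisible break characters visible.
echo json_encode($output), "\n";
```

Line breaks inside an article are left untouched, since they are not followed by a date. One caveat: any line within an article that happens to start with two digits, a slash and two digits would also be treated as a date, so tighten the lookahead if your text can contain such lines.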

See the PCRE documentation on different ways to implement this.
