Convert multiple new lines to paragraphs

I would like to find the paragraphs in a string and format them. What I have kind of works, but not 100%.

So, I have a string that looks like this:

##Chapter 1

Once upon a time there was a little girl named sally, she went to school.

One day it was awesome!

##Chapter 2

We all had a parade!

I am formatting the string by converting ##... to <h2> tags, and it now looks like this:

<h2>Chapter 1</h2>

Once upon a time there was a little girl named sally, she went to school.

One day it was awesome!

<h2>Chapter 2</h2>

We all had a parade!

Now I want to wrap every block of text in a paragraph, and to do so I do this:

// Converts sections to paragraphs:
$this->string = preg_replace("/(^|\n\n)(.+?)(\n\n|$)/", "<p>$2</p>", $this->string);

// Remove paragraph tags around header tags (h1 to h6):
$this->string = preg_replace("/<p><h(\d)>(.+?)<\/h\d><\/p>/i", "<h$1>$2</h$1>", $this->string);

And this is the final output (new lines added for readability):

<h2>Chapter 1</h2>
Once upon a time there was a little girl named sally, she went to school.
<p>One day it was awesome!</p>
<h2>Chapter 2</h2>
<p>We all had a parade!</p>

As I said near the beginning, this doesn't work 100%: as you can see, the first paragraph was not wrapped in <p> tags. What can I do to improve the regular expression?

You can do it in one step:

$this->string = preg_replace('~(*BSR_ANYCRLF)\R\R\K(?>[^<\r\n]++|<(?!h[1-6]\b)|\R(?!\R))+(?=\R\R|$)~u',
                             '<p>$0</p>', $this->string);

Pattern details:

(*BSR_ANYCRLF)       # \R can be any type of newline
\R\R                 # two newlines
\K                   # reset the match start (the two newlines stay out of the result)
(?>                  # open an atomic group
    [^<\r\n]++       # all characters except <, CR, LF
  |                  # OR
    <(?!h[1-6]\b)    # < not followed by a header tag
  |                  # OR
    \R(?!\R)         # single newline
)+                   # close the atomic group and repeat one or more times
(?=\R\R|$)           # followed by two newlines or the end of the string
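To see the one-step pattern in action, here is a minimal sketch that runs it on the sample text from the question. The variable names `$text`, `$pattern` and `$html` are only for illustration; the original code works on `$this->string`.

```php
<?php
// Sample text from the question, after the ## headers became <h2> tags.
$text = "<h2>Chapter 1</h2>\n\n"
      . "Once upon a time there was a little girl named sally, she went to school.\n\n"
      . "One day it was awesome!\n\n"
      . "<h2>Chapter 2</h2>\n\n"
      . "We all had a parade!";

// Same pattern as above: wrap each block that follows a blank line,
// unless the block starts with a header tag.
$pattern = '~(*BSR_ANYCRLF)\R\R\K(?>[^<\r\n]++|<(?!h[1-6]\b)|\R(?!\R))+(?=\R\R|$)~u';

// $0 is the matched block; the blank lines around it are left untouched.
$html = preg_replace($pattern, '<p>$0</p>', $text);

echo $html, "\n";
```

Note that the pattern requires two newlines before each block, so a paragraph at the very start of the string (with no header above it) would not be wrapped. That is not a problem for this sample, which starts with a header.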

Add the m switch to the first regex.

// Converts sections to paragraphs:
$this->string = preg_replace("/(^|\n\n)(.+?)(\n\n|$)/m", "<p>$2</p>", $this->string);

// Remove paragraph tags around header tags (h1 to h6):
$this->string = preg_replace("/<p><h(\d)>(.+?)<\/h\d><\/p>/i", "<h$1>$2</h$1>", $this->string);
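Why the m switch helps: without it, each match consumes the blank line that follows it, so the next block has no leading \n\n left and ^ only matches at the very start of the string. With m, ^ also matches at the start of every line. A small sketch to compare both versions on the sample text (the helper `wrapParagraphs` and its `$flags` parameter are hypothetical, just for this comparison):

```php
<?php
// Sample text from the question, after the ## headers became <h2> tags.
$text = "<h2>Chapter 1</h2>\n\n"
      . "Once upon a time there was a little girl named sally, she went to school.\n\n"
      . "One day it was awesome!\n\n"
      . "<h2>Chapter 2</h2>\n\n"
      . "We all had a parade!";

// Hypothetical helper (not in the original code): runs both steps from the
// question, appending the given flags to the first regex.
function wrapParagraphs(string $s, string $flags): string
{
    // Step 1: wrap blocks in <p> tags.
    $s = preg_replace('/(^|\n\n)(.+?)(\n\n|$)/' . $flags, '<p>$2</p>', $s);
    // Step 2: remove the <p> tags that ended up around header tags.
    return preg_replace('/<p><h(\d)>(.+?)<\/h\d><\/p>/i', '<h$1>$2</h$1>', $s);
}

$without = wrapParagraphs($text, '');   // the first paragraph stays unwrapped
$with    = wrapParagraphs($text, 'm');  // every block gets wrapped

echo $without, "\n\n", $with, "\n";
```

One caveat with the m switch: $ then also matches at the end of every line, so a paragraph that contains single line breaks is split into one <p> per line. If your paragraphs can span several lines, the one-step pattern may be the safer choice.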
