How to detect a list or enumeration in PHP using regular expressions

Question

I'm getting data from an XML feed. I can't control the feed or its content.

Sometimes the data contains a list / enumeration. I want to parse this into a clean HTML unordered list.

The data I receive will be in a format like this:

<p>Some text in a paragraph tag</p>
<p>
- List item one <br>
- List-item-two<br>
-List item three  <br>
- Listitem four<br>
</p>
<p>Another paragraph with text, and maybe even more paragraphs after this one!
They might even contain - dashes - - -  or <br><br> breaks!</p>

Note that not every list item is neatly formatted. Some have extra spaces before the <br> tag, and the spacing between the dash and the text varies.

How can I postprocess this in PHP to get this result:

<p>Some text in a paragraph tag</p>
<p><ul>
    <li>List item one</li>
    <li>List-item-two</li>
    <li>List item three</li>
    <li>Listitem four</li>
</ul></p>
<p>Another paragraph with text, and maybe even more paragraphs after this one! 
They might even contain - dashes - - -  or <br><br> breaks!</p>

Can I do it with a regular expression? If so, what would it look like?

Answer 1

Yes, I think regular expressions are a good starting point. Have a look at preg_replace.

The regex could be something like this (not tested):

$li = preg_replace('/^-([a-z]+)(<br>)?$/i', '<li>$1</li>', $entry);

Of course this will not work as-is (you need to support whitespace and so on), but I think you get the idea.
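A minimal sketch of the same idea with whitespace handling added. The function name dashLinesToListItems is hypothetical, and the pattern assumes "\n" line endings and that every list line starts with a dash and ends with a <br> tag, as in the sample from the question:

```php
<?php
// Hypothetical helper, not part of the original answer.
function dashLinesToListItems(string $html): string
{
    // m: ^ and $ match at line boundaries; i: <BR> is accepted as well.
    // [ \t]* tolerates optional spaces after the dash and around <br>.
    // (.+?) is lazy, so trailing spaces before <br> stay out of the capture.
    $pattern = '/^-[ \t]*(.+?)[ \t]*<br[ \t]*\/?>[ \t]*$/mi';

    // preg_replace returns null on a PCRE error; fall back to the input then.
    return preg_replace($pattern, '<li>$1</li>', $html) ?? $html;
}

$entry = "- List item one <br>\n"
       . "- List-item-two<br>\n"
       . "-List item three  <br>\n"
       . "- Listitem four<br>";

echo dashLinesToListItems($entry), "\n";
```

Lines that do not start with a dash, such as the paragraph containing "- dashes - - -", are left untouched because the pattern is anchored to the start of the line.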

Answer 2

You can get started by replacing ^-\s*\b(.+)\b\s*<br>$ with <li>$1</li>. I'll leave the hard part of wrapping it all in a <ul/> up to you.
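One possible way to handle the wrapping step is to match a whole run of consecutive dash lines first and convert the items inside a callback. This is only a sketch: the function name wrapDashRunsInUl is hypothetical, the pattern is a whitespace-tolerant variant rather than the exact one above, and it assumes "\n" line endings.

```php
<?php
// Hypothetical helper; the answer leaves this step to the reader.
function wrapDashRunsInUl(string $html): string
{
    // One list line: dash, optional spaces, text, optional spaces, <br>.
    $line = '-[ \t]*.+?[ \t]*<br[ \t]*\/?>[ \t]*';

    // A run: one list line followed by any number of further list lines,
    // each on its own line. The final newline is deliberately not consumed.
    $run = '/^' . $line . '(?:\n' . $line . ')*$/mi';

    $result = preg_replace_callback($run, function (array $m): string {
        // Turn every line of the matched run into an indented <li> element.
        $items = preg_replace(
            '/^-[ \t]*(.+?)[ \t]*<br[ \t]*\/?>[ \t]*$/mi',
            '    <li>$1</li>',
            $m[0]
        );
        return "<ul>\n" . ($items ?? $m[0]) . "\n</ul>";
    }, $html);

    // preg_replace_callback returns null on a PCRE error.
    return $result ?? $html;
}

$feed = "<p>Some text in a paragraph tag</p>\n"
      . "<p>\n"
      . "- List item one <br>\n"
      . "- List-item-two<br>\n"
      . "-List item three  <br>\n"
      . "- Listitem four<br>\n"
      . "</p>";

echo wrapDashRunsInUl($feed), "\n";
```

The line breaks between <p> and <ul> are kept, so the whitespace differs slightly from the desired output in the question; the list structure itself is the same.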

Answer 1
3 2014-02-11 16:53:32

Answer 2
3 2014-02-11 16:54:02
