
Regex to match first two <li> of a <ul> list

We have an HTML string that contains a list, something like the one shown below:

<ul><li>Item A</li><li>Item B</li><li>Item C</li><li>Item D</li></ul>

Here is the same markup, formatted:

<ul>
   <li>Item A</li>
   <li>Item B</li>
   <li>Item C</li>
   <li>Item D</li>
</ul>

My objective is to write a regular expression that selects the first two items of the list, so the output should be:

<ul>
   <li>Item A</li>
   <li>Item B</li>
</ul>

If this is not possible with a regex, what would be the most efficient way to do it in plain JavaScript?

Don't use a regex for this. Parse the HTML into a document and use DOM methods to remove the unwanted elements:

const html = `<ul><li>Item A</li><li>Item B</li><li>Item C</li><li>Item D</li></ul>`;
const parser = new DOMParser();
const doc = parser.parseFromString(html, "text/html");

// Remove all <li> elements after the 2nd
doc.querySelectorAll("li:nth-child(n+3)").forEach(el => el.remove());

// DOMParser puts the HTML fragment into the created document body
const newHtml = doc.body.innerHTML;
console.log(newHtml);
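The selector `li:nth-child(n+3)` may be the least obvious part of that snippet. In the `An+B` notation, `n` counts up from 0, so `n+3` yields the 1-based sibling positions 3, 4, 5, and so on. The sketch below only mimics that arithmetic in plain JavaScript; `positionsToRemove` is a hypothetical name used for illustration and is not part of any DOM API.

```javascript
// Illustrative only: mimics which 1-based sibling positions
// the selector li:nth-child(n+3) would pick out of `itemCount` items.
function positionsToRemove(itemCount) {
  const positions = [];
  // n starts at 0, so n + 3 yields 3, 4, 5, ... up to the number of siblings.
  for (let n = 0; n + 3 <= itemCount; n++) {
    positions.push(n + 3);
  }
  return positions;
}

// For the four-item list in the question: positions 3 and 4 (Item C and Item D).
console.log(positionsToRemove(4));
```

Positions 1 and 2 are never produced, which is why the first two items survive the removal.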

This also works when end tags are omitted, like this:

<ul>
   <li>Item A
   <li>Item B
   <li>Item C
   <li>Item D
</ul>

which is something a regular expression could really struggle with.
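To make that concrete, here is a small sketch of how a naive pattern behaves on the markup without end tags. The pattern `closedItemPattern` is an assumption chosen for illustration (it expects every item to be closed with `</li>`); a more elaborate regex could cope with this particular input, but each such special case has to be handled by hand.

```javascript
// The same list with the optional </li> end tags omitted.
const htmlWithoutEndTags = `<ul>
   <li>Item A
   <li>Item B
   <li>Item C
   <li>Item D
</ul>`;

// A naive pattern that expects every item to be closed explicitly.
const closedItemPattern = /<li>(.*?)<\/li>/g;

// match() returns null when nothing matches, so fall back to an empty array.
const matchedItems = htmlWithoutEndTags.match(closedItemPattern) || [];

// No </li> appears in the input, so the pattern finds no items at all.
console.log(matchedItems.length); // → 0
```

An HTML parser treats both forms of the list the same way, because closing an open `<li>` when the next one starts is part of the parsing rules.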

With a regex, you can do:

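const str = `<ul><li>Item A</li><li>Item B</li><li>Item C</li><li>Item D</li></ul>`;

// Build a global, non-greedy pattern for a tag that has no attributes
const re = tag => new RegExp(`<${tag}>(.*?)</${tag}>`, 'g');

// For each <ul>, replace its inner HTML with only the first two <li> matches
const ul = str.replace(re('ul'), (match, inner) =>
  match.replace(inner, inner.match(re('li')).slice(0, 2).join('')));

console.log(ul);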
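One limitation worth noting: the pattern above uses `.`, which does not match line breaks, so it leaves the formatted (multi-line) version of the list unchanged. The following is a minimal sketch of a variant that uses `[\s\S]` instead. `keepFirstListItems` is a hypothetical helper name, and the sketch still assumes tags without attributes, explicit `</li>` end tags, and no nested lists.

```javascript
// Hypothetical helper (not from the answer above): keeps the first `count`
// <li> items of every attribute-free, non-nested <ul> found in `html`.
function keepFirstListItems(html, count) {
  // [\s\S] matches any character, including line breaks, unlike `.`
  const listPattern = /<ul>([\s\S]*?)<\/ul>/g;
  const itemPattern = /<li>[\s\S]*?<\/li>/g;

  return html.replace(listPattern, (_wholeList, inner) => {
    // match() returns null for an empty list, so fall back to an empty array.
    const items = inner.match(itemPattern) || [];
    // Whitespace between items is dropped when the list is rebuilt.
    return `<ul>${items.slice(0, count).join("")}</ul>`;
  });
}

const formatted = `<ul>
   <li>Item A</li>
   <li>Item B</li>
   <li>Item C</li>
   <li>Item D</li>
</ul>`;

console.log(keepFirstListItems(formatted, 2));
// → <ul><li>Item A</li><li>Item B</li></ul>
```

Even with this change, the regex approach stays tied to one exact shape of markup, so it is best reserved for input you control.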
