Regex to match first two <li> of a <ul> list
We have an HTML string that contains a list, something like this:
<ul><li>Item A</li><li>Item B</li><li>Item C</li><li>Item D</li></ul>
Here is the same markup, formatted:
<ul>
<li>Item A</li>
<li>Item B</li>
<li>Item C</li>
<li>Item D</li>
</ul>
My objective is to write a regular expression that selects the first two items of the list, so the output should be:
<ul>
<li>Item A</li>
<li>Item B</li>
</ul>
If that's not possible with a regex, what would be the most efficient way to do it in plain JavaScript?
Don't use a regex for this. Parse the HTML into a document fragment and use DOM methods to remove the elements:
const html = `<ul><li>Item A</li><li>Item B</li><li>Item C</li><li>Item D</li></ul>`;

const parser = new DOMParser();
const doc = parser.parseFromString(html, "text/html");

// Remove all <li> elements after the 2nd
doc.querySelectorAll("li:nth-child(n+3)").forEach(el => el.remove());

// DOMParser puts the HTML fragment into the created document body
const newHtml = doc.body.innerHTML;
console.log(newHtml);
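If the number of items to keep varies, the same idea can be wrapped in a small helper. This is only a sketch: `overflowSelector` and `keepFirstItems` are hypothetical names that do not appear in the answer, and the DOM part is guarded because `DOMParser` is assumed to exist only in browser-like environments.

```javascript
// Hypothetical helper: selector for every <li> after the first `count` siblings.
function overflowSelector(count) {
  return `li:nth-child(n+${count + 1})`;
}

// Hypothetical helper: parse `html`, remove the surplus <li> elements,
// and return the remaining markup. Needs an environment with DOMParser.
function keepFirstItems(html, count) {
  const doc = new DOMParser().parseFromString(html, "text/html");
  doc.querySelectorAll(overflowSelector(count)).forEach(el => el.remove());
  return doc.body.innerHTML;
}

console.log(overflowSelector(2)); // li:nth-child(n+3)

// Only run the DOM part where DOMParser is available (for example, a browser).
if (typeof DOMParser !== "undefined") {
  const sample = "<ul><li>Item A</li><li>Item B</li><li>Item C</li><li>Item D</li></ul>";
  console.log(keepFirstItems(sample, 2));
}
```

Note that the selector applies to every list in the parsed markup, nested lists included, so a more specific selector may be needed for more complex input.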
This also works with end-tag omission, like this:
<ul>
<li>Item A
<li>Item B
<li>Item C
<li>Item D
</ul>
which is potentially something a regular expression could really struggle with.
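To make that concrete, here is a small sketch comparing a naive pattern on markup with and without end tags. The pattern `liPattern` is an assumption chosen for illustration. A more elaborate regex could cope with this particular input, but each HTML variation tends to need its own special case.

```javascript
// A naive pattern that relies on explicit </li> end tags (illustrative assumption).
const liPattern = /<li>(.*?)<\/li>/g;

const withEndTags = "<ul><li>Item A</li><li>Item B</li></ul>";
const withoutEndTags = "<ul><li>Item A<li>Item B</ul>";

// With a global regex, match() returns an array of matches, or null if there are none.
const found = withEndTags.match(liPattern);
const missed = withoutEndTags.match(liPattern);

console.log(found);  // two matches: '<li>Item A</li>' and '<li>Item B</li>'
console.log(missed); // null, because no </li> end tag is ever found
```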
You can do:
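const str = `<ul><li>Item A</li><li>Item B</li><li>Item C</li><li>Item D</li></ul>`

// Build a global regex that lazily captures the content between <tag> and </tag>
const re = tag => new RegExp(`<${tag}>(.*?)<\/${tag}>`, 'g')

// match is the whole <ul>...</ul>; inner is its content, replaced by the first two <li> elements
const ul = str.replace(re('ul'), (match, inner) =>
  match.replace(inner, inner.match(re('li')).slice(0, 2).join(``)))

console.log(ul)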
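One caveat: `.` does not match newlines by default, so this pattern finds nothing in the formatted, multi-line version of the list and returns the string unchanged. The sketch below is a hypothetical variant (`tagPattern` and `firstTwoItems` are names made up here) that uses `[\s\S]` to span line breaks. It still assumes tags without attributes, explicit end tags, and no nested lists.

```javascript
// Hypothetical variant: [\s\S] matches any character, including newlines.
const tagPattern = tag => new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`, "g");

// Keep only the first two <li> elements of each <ul> in the string.
function firstTwoItems(html) {
  return html.replace(tagPattern("ul"), (match, inner) => {
    // match() returns null when no <li> is found, so fall back to an empty list.
    const items = inner.match(tagPattern("li")) || [];
    return `<ul>${items.slice(0, 2).join("")}</ul>`;
  });
}

const formatted = `<ul>
<li>Item A</li>
<li>Item B</li>
<li>Item C</li>
<li>Item D</li>
</ul>`;

console.log(firstTwoItems(formatted)); // <ul><li>Item A</li><li>Item B</li></ul>
```

The whitespace between items is dropped because only the matched `<li>` elements are joined back together.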