Unable to extract groups of opening and closing HTML tags from a JavaScript array

Question

I have this JavaScript array:

let a = [
    [0, "<p><strong>Lorem Ipsum</strong> is simply dummy text of "],
    [1, "<strong>"],
    [0, "the"],
    [1, "</strong>"],
    [0, " printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type "],
    [-1,"and"],
    [1, "test"],
    [0, " scrambled it to make a type"],
    [1, "  added"],
    [0, "</p>"],
    [1, "<ul><li>test</li></ul>"]
];

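I am trying to extract groups of array items based on the following conditions.

Take this run of items from the array above as an example: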

[1, "<strong>"],
[0, "the"],
[1, "</strong>"]

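This run forms a group because its first item has a[i][0] == 1 and its a[i][1] is an opening HTML tag. Here a[i][1] contains <strong>, which is a valid opening HTML tag, so I want to push every item from the opening tag through the matching closing tag.

These are the groups I expect: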

let group = [
  [
    [1, "<strong>"],
    [0, "the"],
    [1, "</strong>"]
  ],
  [
    [1, "<ul><li>test</li></ul>"]
  ]
];

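I want to extract groups according to these rules:

1. The item's first element is 1, i.e. a[i][0] == 1, and a[i][1] is a valid opening HTML tag.
2. The item's first element is 0, i.e. a[i][0] == 0, and it sits between an item matching rule 1 and an item matching rule 3.
3. The item's first element is 1, i.e. a[i][0] == 1, and a[i][1] is a valid closing HTML tag.

The items that satisfy these three rules together make up one group (one JavaScript object).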
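Rules 1 and 3 both depend on deciding whether a string is exactly one opening or one closing tag. A minimal sketch of that check follows; tagKind is a hypothetical helper name, not something from the original post, and the regular expressions assume simple, well-formed tags (self-closing syntax and attribute values containing > are not handled).

```javascript
// Hypothetical helper (an assumption for illustration, not from the post).
// Returns { kind: "open" | "close", name } when the whole string is exactly
// one HTML tag, or null for plain text and for strings with several tags.
function tagKind(text) {
  const trimmed = text.trim();

  // A closing tag such as </strong>
  const close = /^<\/([a-zA-Z][\w-]*)\s*>$/.exec(trimmed);
  if (close) return { kind: "close", name: close[1].toLowerCase() };

  // An opening tag such as <strong> or <a href="#">
  const open = /^<([a-zA-Z][\w-]*)(\s[^<>]*)?>$/.exec(trimmed);
  if (open) return { kind: "open", name: open[1].toLowerCase() };

  return null;
}

console.log(tagKind("<strong>"));
console.log(tagKind("</strong>"));
console.log(tagKind("the"));
```

Returning the tag name along with the kind lets a caller confirm that the closing tag in rule 3 matches the opening tag from rule 1.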

There can also be a case such as:

[1,"<ul><li>test</li></ul>"]

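Here a single array item contains a complete group, <ul><li>test</li></ul>. This should also be included in the final result array.

Edit

I have updated my approach: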
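One way to recognise such a self-contained item is to check that every tag opened inside the string is also closed inside it. The sketch below assumes simple, well-formed markup; isSelfContained is a hypothetical name, and void elements such as <br> would be reported as unbalanced.

```javascript
// Hypothetical helper (an assumption for illustration, not from the post).
// True when the string contains at least one tag and every opening tag is
// closed, in order, within the same string.
function isSelfContained(html) {
  const tagPattern = /<(\/?)([a-zA-Z][\w-]*)[^>]*>/g;
  const stack = []; // names of tags that are still open
  let tagCount = 0;

  for (const [, slash, rawName] of html.matchAll(tagPattern)) {
    tagCount++;
    const name = rawName.toLowerCase();
    if (slash === "") {
      stack.push(name);
    } else if (stack.pop() !== name) {
      return false; // closing tag without a matching opening tag
    }
  }
  return tagCount > 0 && stack.length === 0;
}

console.log(isSelfContained("<ul><li>test</li></ul>"));
console.log(isSelfContained("<strong>"));
```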

let a = [
    [0, "<p><strong>Lorem Ipsum</strong> is simply dummy text of "],
    [1, "<strong>"],
    [0, "the"],
    [1, "</strong>"],
    [0, " printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type "],
    [-1, "and"],
    [1, "test"],
    [0, " scrambled it to make a type"],
    [1, " added"],
    [0, "</p>"],
    [1, "<ul><li>test</li></ul>"]
];

checkAndRemoveGroups(a, 1);

function checkAndRemoveGroups(arr, group) {
    let htmlOpenRegex = /<([\w \d \s]+)([^<]+)([^<]+) *[^/?]>/g;
    let groupArray = new Array();
    let depth = 0;
    // Iterate the array to find out groups and push the items
    for (let i = 0; i < arr.length; i++) {
        if (arr[i][0] == group && arr[i][1].match(htmlOpenRegex)) {
            depth += 1;
            groupArray.push({
                Index: i,
                Value: arr[i],
                TagType: "Open"
            });
        }
    }
    console.log(groupArray);
}

Answer 1

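You could use an array as a stack of open tags: push each opening tag, pop it when its closing tag arrives, and check the array's length to see whether more closing tags are still needed to close the outermost tag.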

// Collect every tag in a string, e.g. "<ul><li>" -> ["ul", "li"], "</li>" -> ["/li"]
function getTags(string) {
    var regex = /<(\/?[^>]+)>/g,
        m,
        result = [];

    while ((m = regex.exec(string)) !== null) {
        // This is necessary to avoid infinite loops with zero-width matches
        if (m.index === regex.lastIndex) {
            regex.lastIndex++;
        }
        result.push(m[1]);
    }
    return result;
}

var array = [
        [0, "<p><strong>Lorem Ipsum</strong> is simply dummy text of "],
        [1, "<strong>"],
        [0, "the"],
        [1, "</strong>"],
        [0, " printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type "],
        [-1, "and"],
        [1, "test"],
        [0, " scrambled it to make a type"],
        [1, " added"],
        [0, "</p>"],
        [1, "<ul><li>test</li></ul>"]
    ],
    result = [],
    nested = [], // stack of tags that are currently open
    tags,
    i = 0;

while (i < array.length) {
    if (array[i][0] === 1) {
        tags = getTags(array[i][1]);
        if (!tags.length) {
            i++;
            continue;
        }
        result.push([]); // new group found
        while (i < array.length) {
            tags.forEach(function (t) {
                if (t.startsWith('/')) {
                    // closing tag: pop it if it matches the top of the stack
                    if (nested[nested.length - 1] === t.slice(1)) {
                        nested.length--;
                    }
                    return;
                }
                nested.push(t);
            });
            result[result.length - 1].push(array[i]);
            if (!nested.length) {
                break; // every opened tag is closed, the group is complete
            }
            i++;
            // guard against running past the end when a tag is never closed
            if (i < array.length) {
                tags = getTags(array[i][1]);
            }
        }
    }
    i++;
}

console.log(result);

Answer 2

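I'm with Scott... I think there has to be a better way to do what you are trying to do. I understand that you are trying to pull something out of this array, but there may be a completely different way to approach the problem, one where your HTML is not nested inside sub-arrays.

- Edited - I misunderstood what you were looking for, so my original reply did not actually show you what was going wrong, and I have removed it. Take another look at this.

Is this exactly what you want to receive? If you check each element against an HTML regex, I don't see how you would ever get [0,"the"]. Each element would end up in its own object, which does not seem to be what you want.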

let group = [
  [
    [1, "<strong>"],
    [0, "the"],
    [1, "</strong>"]
  ],
  [
    [1, "<ul><li>test</li></ul>"]
  ]
];
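To illustrate the point about per-element matching, here is a small sketch that filters a shortened copy of the array with a deliberately simple tag regex (hasTag is an assumption for this example, not the pattern from the question). Each matching item is tested in isolation, so the plain-text item between the tags can never be selected.

```javascript
// Shortened sample taken from the question's array.
const sample = [
  [1, "<strong>"],
  [0, "the"],
  [1, "</strong>"],
  [1, "<ul><li>test</li></ul>"]
];

// Simplified tag test, assumed for illustration only.
const hasTag = /<\/?[a-zA-Z][^>]*>/;

// Keep items flagged 1 whose text contains a tag.
const matches = sample.filter(([flag, text]) => flag === 1 && hasTag.test(text));

// [0, "the"] is missing because it has flag 0 and contains no tag.
console.log(matches);
```

Grouping therefore needs some state carried from one item to the next, such as a stack of open tags, rather than an independent test per item.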

Answer 1
1 2018-03-03 18:42:44

Answer 2
0 2018-03-03 17:42:30
