
Regular Expression to get attributes from HTML comment like string

I have a string that looks like one of these:

  1. <!--Tag:Name-->
  2. <!--Tag:Name param="abc"-->
  3. <!--Tag:Name param="abc" param2="xyz"-->

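In addition, I have files that contain many of these tags, so I want to find all the tags first and then parse them one by one.

Example file: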

<head>
   <!--Tag:Test-->
   <!--Tag:Test2 param="abc"-->
   <!--Tag:Test3 param2="abc" param5="xyz"-->
</head>

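I am looking for a regular expression that parses this kind of markup and captures the tag name and its attributes.

I tried something like this: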

tempRegex = new RegExp(/<!--Tag:(.*?)\s{1,}(.*?=".*?")-->/, 'i');

`<!--Tag:Test param="abc" param2="xyz"-->`.match(tempRegex);

But it returns these match groups, with both attributes lumped into a single group:

0: "<!--Tag:Test param="abc" param2="xyz"-->"
1: "Test"
2: "param="abc" param2="xyz""

What I want to get is:

0: "<!--Tag:Test param="abc" param2="xyz"-->"
1: "Test"
2: "param="abc""
3: "param2="xyz""

Are you trying to extract "key-value" pairs?

((\w+)[:=]"*(\w+)"*)

Applied globally, this matches:

Tag:Test
param="abc"
param2="xyz"
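
To see what this pattern yields in practice, here is a minimal sketch that runs it through `String.prototype.matchAll`. The `g` flag and the names `pairRegex`, `input`, and `pairs` are additions for illustration and are not part of the answer above. Because the value part is `\w+`, the pattern as written only handles values made of word characters (no spaces or punctuation).

```javascript
// The pattern from the answer, with the `g` flag added (an assumption:
// matchAll requires a global regex to iterate over all pairs).
const pairRegex = /((\w+)[:=]"*(\w+)"*)/g;

const input = '<!--Tag:Test param="abc" param2="xyz"-->';

// For each match: m[1] is the whole pair, m[2] the key, m[3] the value.
const pairs = [...input.matchAll(pairRegex)].map((m) => ({
  key: m[2],
  value: m[3],
}));

console.log(pairs);
```

The first pair is the tag itself (key `Tag`, value `Test`); the remaining pairs are the attributes.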

This will do it:

/Tag:[a-z]+|[a-z\d]+="[^"]+"/gmi

https://regex101.com/r/6dVJXO/1

var s = `<!--Tag:Name param="abc" param2="xyz"-->`;
var r = /Tag:[a-z]+|[a-z\d]+="[^"]+"/gmi;
console.log([...s.matchAll(r)]);
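
Building on the pattern above, one possible way to get the shape asked for in the question (a name plus separate attributes for every tag in a file) is to match in two passes: first find each comment, then split its attribute text. This is only a sketch. The helper name `parseTags` and the `{ name, attrs }` result shape are assumptions, and `[^"]*` is used instead of `[^"]+` so that empty values such as `param=""` are also accepted.

```javascript
// Hypothetical helper: returns [{ name, attrs }] for each <!--Tag:Name ...--> comment.
function parseTags(source) {
  // Pass 1: find each tag comment. Group 1 is the tag name,
  // group 2 is the raw attribute text (empty when there are no attributes).
  const tagRegex = /<!--Tag:(\w+)((?:\s+[a-z\d]+="[^"]*")*)\s*-->/gi;
  // Pass 2: split the raw attribute text into key/value pairs.
  const attrRegex = /([a-z\d]+)="([^"]*)"/gi;

  return [...source.matchAll(tagRegex)].map(([, name, rawAttrs]) => {
    const attrs = {};
    for (const [, key, value] of rawAttrs.matchAll(attrRegex)) {
      attrs[key] = value;
    }
    return { name, attrs };
  });
}

// The example file from the question.
const file = `<head>
   <!--Tag:Test-->
   <!--Tag:Test2 param="abc"-->
   <!--Tag:Test3 param2="abc" param5="xyz"-->
</head>`;

const tags = parseTags(file);
console.log(tags);
```

A single regular expression cannot return a variable number of capture groups in JavaScript, because a repeated group keeps only its last match. That is why the attributes are extracted in a second pass.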

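You can look at the source code at https://github.com/artdecocode/rexml, which does what you need. It requires you to know the tag names in advance, but you can change the regular expression to suit your case.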

/**
 * Extract member elements from an XML string. Numbers and booleans will be parsed into their JS types.
 * @param {string|!Array<string>} tag Which tag to extract, e.g., `div`. Can also pass an array of tags, in which case the name of the tag will also be returned.
 * @param {string} string The XML string.
 * @example
 *
 * const xml = `
 * <html>
 *   <div id="1" class="test" contenteditable>
 *     Hello World
 *   </div>
 * </html>
 * `
 * const [{ content, props }] = extractTags('div', xml)
 * // content: Hello World
 * // props: { id: 1, class: 'test', contenteditable: true }
 */
// Note: `simple`, `mismatch` and `extractProps` are helpers that are not shown in this excerpt.
const extractTags = (tag, string) => {
  const tags = Array.isArray(tag) ? tag : [tag]
  const t = tags.join('|')
  const end1 = /\s*\/>/
  const end2 = />([\s\S]+?)?<\/\1>/
  const re = new RegExp(`<(${t})${simple.source}?(?:${end1.source}|${end2.source})`, 'g')

  const matches = mismatch(re, string, ['t', 'a', 'v', 'v1', 'v2', 'c'])
  const res = matches.map(({ 't': tagName, 'a': attributes = '', 'c': content = '' }) => {
    const attrs = attributes.replace(/\/$/, '').trim()
    const props = extractProps(attrs)
    return { content, props, tag: tagName }
  })
  return res
}
