
How can I find HTML-like tags in a string using JavaScript?

I have the following string:

var originalStr = "Test example <firstTag>text inside first tag</firstTag>, <secondTag>50</secondTag> end."

What's the best way to identify all tags, the corresponding tag names, and their content? This is the kind of result I'm looking for:


var tagsFound = 
    [ { "tagName": "firstTag",  "value": "text inside first tag" } 
    , { "tagName": "secondTag", "value": "50" } 
    ] 

HTML is very complicated to parse, so the best approach is to use a parser that already exists.

If you're doing this in a browser, you can use the one built into the browser: DOMParser.
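A minimal sketch of the browser route, assuming the string is parsed as "text/html" and only top-level elements matter. The helper name collectTags is made up for illustration; DOMParser, parseFromString, localName and textContent are standard DOM APIs.

```javascript
// Collect { tagName, value } pairs from the top-level elements of a parsed document.
// collectTags is a hypothetical helper name, not part of any library.
function collectTags(doc) {
    return Array.from(doc.body.children, (el) => ({
        tagName: el.localName,   // the HTML parser lowercases tag names
        value: el.textContent,
    }));
}

// DOMParser only exists in browsers, so guard the call.
if (typeof DOMParser !== "undefined") {
    const originalStr = "Test example <firstTag>text inside first tag</firstTag>, <secondTag>50</secondTag> end.";
    const doc = new DOMParser().parseFromString(originalStr, "text/html");
    console.log(collectTags(doc));
}
```

Because the HTML parser lowercases tag names, this reports firsttag rather than firstTag. If the input is well-formed XML, wrapping it in a single root element and parsing it as "application/xml" should keep the original case, at the cost of rejecting malformed input.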

If you're doing this in Node.js, there are several libraries to do it, such as jsdom. It provides an API almost identical to the one in web browsers.

Here's a jsdom example:

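const { JSDOM } = require("jsdom");

// originalStr is the string from the question
const dom = new JSDOM("<!doctype html>" + originalStr);
const doc = dom.window.document;
// Only direct children of <body> are visited; nested elements are not listed separately
for (const childElement of doc.body.children) {
    console.log(`${childElement.tagName} - ${childElement.textContent}`);
}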

With your string, that would output:

FIRSTTAG - text inside first tag
SECONDTAG - 50

You'd write code using the DOM methods provided to create the output you're looking for. (Note the tag name normalization above; if the original capitalization matters to what you're doing, you may have to use jsdom's nodeLocation, which requires the includeNodeLocations option, to find the tag in the source text and read its name from there.)
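As a rough sketch of recovering the original capitalization: the helper below (originalTagName, a made-up name) reads the tag name back out of the source string from a location object. The shape { startTag: { startOffset, endOffset } } is an assumption about what jsdom's nodeLocation returns for an element; the location used here is built by hand so the snippet runs without jsdom.

```javascript
// Recover a tag name exactly as written in the source, given the location
// object for its element. The { startTag: { startOffset, endOffset } } shape
// is assumed to match jsdom's dom.nodeLocation(element) output.
function originalTagName(source, location) {
    // Elements that are not in the source (e.g. an implied <body>) have no location.
    if (!location || !location.startTag) return null;
    const startTagText = source.slice(location.startTag.startOffset, location.startTag.endOffset);
    const match = /^<([^\s/>]+)/.exec(startTagText);
    return match ? match[1] : null;
}

// Hand-built location for <firstTag>, for illustration only.
const source = "Test example <firstTag>text inside first tag</firstTag>";
const start = source.indexOf("<firstTag>");
const location = { startTag: { startOffset: start, endOffset: start + "<firstTag>".length } };
console.log(originalTagName(source, location)); // firstTag
```

With jsdom, source would need to be the exact string passed to the JSDOM constructor (including any prepended doctype), since the offsets index into that string.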

Depending on the complexity of the strings you're dealing with, a simple regex solution might work (it works nicely for your string):

var str = 'Test example <firstTag>text inside first tag</firstTag>, <secondTag>50</secondTag> end.';
var tagsFound = [];

str.replace(/<([a-zA-Z][a-zA-Z0-9_-]*)\b[^>]*>(.*?)<\/\1>/g, function (m, m1, m2) {
    // write data to the result object
    tagsFound.push({
        "tagName": m1,
        "value": m2
    });
    // return the match unchanged so the string is not modified
    return m;
});

// Displaying the results
for (var i = 0; i < tagsFound.length; i++) {
    console.log(tagsFound[i]);
}
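The same pattern can also be run with String.prototype.matchAll, which collects the matches directly instead of using replace for its side effects. This is only an alternative sketch of the answer's approach; the variable name tagPattern is made up here.

```javascript
const str = 'Test example <firstTag>text inside first tag</firstTag>, <secondTag>50</secondTag> end.';

// Same pattern as above: group 1 is the tag name, group 2 the content,
// and \1 requires the closing tag to repeat the opening name.
const tagPattern = /<([a-zA-Z][a-zA-Z0-9_-]*)\b[^>]*>(.*?)<\/\1>/g;

// matchAll needs the g flag; each match is [fullMatch, group1, group2].
const tagsFound = Array.from(str.matchAll(tagPattern), ([, tagName, value]) => ({ tagName, value }));

console.log(tagsFound);
// [ { tagName: 'firstTag', value: 'text inside first tag' },
//   { tagName: 'secondTag', value: '50' } ]
```

Unlike the parser-based approach, this keeps the original capitalization of the tag names, since it reads them straight from the source text.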

There will be a problem when self-closing tags or tags containing other tags are taken into account, like <selfClosedTag/> or <tag><tag>something</tag>else</tag>.
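To make those two failure cases concrete, here is a small sketch that runs the same pattern on both inputs. The wrapper function findTags is a hypothetical name introduced only for this demonstration.

```javascript
// Apply the regex from the answer and return { tagName, value } pairs.
function findTags(str) {
    const tagPattern = /<([a-zA-Z][a-zA-Z0-9_-]*)\b[^>]*>(.*?)<\/\1>/g;
    return Array.from(str.matchAll(tagPattern), ([, tagName, value]) => ({ tagName, value }));
}

// Self-closing tag: there is no closing </selfClosedTag>, so nothing matches
// and the tag is silently skipped.
console.log(findTags("<selfClosedTag/>"));
// []

// Nested same-name tags: the lazy (.*?) stops at the first </tag>, so the outer
// opening tag is paired with the inner closing tag and "else" is dropped.
console.log(findTags("<tag><tag>something</tag>else</tag>"));
// [ { tagName: 'tag', value: '<tag>something' } ]
```

Neither case raises an error, so bad results can go unnoticed; a real parser is the safer choice if such input is possible.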


