
JavaScript equivalent of PHP DOMDocument object

I wrote some PHP code to parse data that I receive from an API request to "wikipedia.org". I used the DOMDocument class to parse the data and it worked perfectly fine. Now I want to do the same job in JavaScript. The API request returns (after a little cleaning up) a string like this:

$htmlString = "<ul>
    <li>Item 1</li>
    <li>Item 2</li>
</ul>
<ul>
    <li>Item 3</li>
    <li>Item 4</li>
    <li>Item 5</li>
</ul>";

Note that this is just an example. Any request might return a different number of lists, but the result is always a series of unordered lists. I needed to get the text inside the <li> tags, and the following PHP code worked perfectly fine.

$DOM = new DOMDocument;
$DOM->loadHTML($htmlString);
$lis = $DOM->getElementsByTagName('li');
$items =[];
for ($i = 0; $i < $lis->length; $i++) $items[] = $lis[$i]->nodeValue;

And I get the array [Item 1, ..., Item 5] in the $items variable, as I wanted. Now I want to do the same job in JavaScript. That is, I have a string

const htmlString = `<ul>
    <li>Item 1</li>
    <li>Item 2</li>
</ul>
<ul>
    <li>Item 3</li>
    <li>Item 4</li>
    <li>Item 5</li>
</ul>`;

in JavaScript, and I want to get the text inside each of the <li> tags. I searched the web for a JavaScript equivalent of PHP's DOMDocument class and, surprisingly, found nothing. Any ideas how to do this in (preferably vanilla) JavaScript, similar to the PHP code? If not, any idea how to do this some other way in JavaScript (maybe even with regular expressions)?

Use DOMParser()

Your ported code, which is very much the same as your PHP:

let parser = new DOMParser()
let doc = parser.parseFromString(`<ul>
    <li>Item 1</li>
    <li>Item 2</li>
</ul>
<ul>
    <li>Item 3</li>
    <li>Item 4</li>
    <li>Item 5</li>
</ul>`, "text/html")
let lis = doc.getElementsByTagName('li')
let items = []
for (let i = 0; i < lis.length; i++) items.push(lis[i].textContent)
console.log(items)
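The DOMParser approach can also be wrapped in a small reusable function. What follows is only a sketch, not part of the original answer: the name extractListItems and its second parameter are hypothetical, and it swaps the index loop for querySelectorAll plus Array.from. DOMParser is a browser API and is not a global in plain Node.js, so the constructor is taken as a parameter that defaults to the global one when it exists.

```javascript
// Hypothetical helper (name and signature are illustrative, not from the original answer).
// ParserCtor defaults to the browser's DOMParser; plain Node.js has no such global,
// so a compatible constructor would have to be passed in there.
function extractListItems(htmlString, ParserCtor = globalThis.DOMParser) {
    if (typeof ParserCtor !== 'function') {
        throw new Error('No DOMParser available in this environment');
    }
    const doc = new ParserCtor().parseFromString(htmlString, 'text/html');
    // querySelectorAll returns a static NodeList; Array.from copies it into a real
    // array and applies the mapping function to each <li> element.
    return Array.from(doc.querySelectorAll('li'), li => li.textContent.trim());
}

// Browser usage:
// const items = extractListItems('<ul><li>Item 1</li></ul><ul><li>Item 2</li></ul>');
```

Trimming each textContent is a design choice of this sketch; drop the trim() call if surrounding whitespace matters.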

If you're working strictly with strings, you'll want to use regular expressions.

FYI, I'm using ES2015+ syntax (template literals and arrow functions). If you can't support this, you'll need to convert it to syntax that your users' browsers support.

Here I have an expression that strips the opening and closing <ul> tags and replaces each <li>...</li> pair with the text between the tags. Then I split the string into an array on the line breaks. We need to filter out empty elements from the resulting array, strip the leftover indentation, and finally return the desired items in a final array.

var htmlString = `<ul>
    <li>Item 1</li>
    <li>Item 2</li>
</ul>
<ul>
    <li>Item 3</li>
    <li>Item 4</li>
    <li>Item 5</li>
</ul>`;
var lis = htmlString.replace(/<ul>|<li>(.*)<\/li>|<\/ul>/g, '$1').split('\n');
var items = lis.filter(item => {
    if (item && item !== null && item !== '') {
        return item;
    }
}).map(item => {
    var element = item.replace(/\s{2,}/g, '');
    return element;
});
console.log('items array:', items);
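For comparison, a regular expression can also capture the contents of each <li> directly, without relying on line breaks. This is a sketch under stated assumptions, not the answerer's method: extractWithRegex is a hypothetical name, the list items are assumed to hold plain text with no nested <li> elements, and HTML entities are not decoded. Regular expressions are not a general HTML parser, so this only suits simple, predictable markup like the example.

```javascript
// Hypothetical helper; assumes flat <li> elements that contain plain text only.
function extractWithRegex(htmlString) {
    // <li\b[^>]*> tolerates attributes on the opening tag without matching <link>;
    // [\s\S]*? lazily matches any characters, newlines included, up to the nearest </li>.
    const pattern = /<li\b[^>]*>([\s\S]*?)<\/li>/gi;
    // matchAll requires the g flag and yields one match object per <li>...</li> pair.
    return Array.from(htmlString.matchAll(pattern), match => match[1].trim());
}

const sampleHtml = `<ul>
    <li>Item 1</li>
    <li>Item 2</li>
</ul>
<ul>
    <li>Item 3</li>
</ul>`;
console.log(extractWithRegex(sampleHtml)); // the three item texts: Item 1, Item 2, Item 3
```

Because the pattern does not depend on line breaks, it also handles input where several items sit on one line.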
