
Node.js: how to create an array of specific objects based on data from an HTML string?

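I'm a beginner with Node.js, and for testing purposes I'd like to build a simple application that creates an array of objects from a given HTML string.

Let me explain: I have an HTML string that contains multiple div elements like the following: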

<div class="user_container">
    <div class="user">
        <div class="thumb">
            <!--            thumbnail block-->
        </div>
        <div class="web_presence_locations"></div>

        <div class="user_data">
            <span class="name">Jaroslaw Chujczynski</span>
            <p class="location_with_flag">
                <!--                img with url here-->
                Leeds,
                United Kingdom
            </p>
            <div class="user_details">
                <div class="amount currency">
                    £28,000.00
                    <span class="overbooked">(in overfunding)</span>
                </div>
            </div>
        </div>
    </div>
    <div class="profile_container">
        <div class="extra_profile_data" style="">
            <div class="investments last">
                <h3 class="h5">Recent Investments</h3>
                <ul>
                    <li class="first">
                        <div class="campaign-logo-frame">
                            <a class="campaign_link" href="/test1">test1</a>
                            <span class="currency">£28,000.00</span>
                        </div>
                    </li>
                    <li class="">
                        <div class="campaign-logo-frame">
                            <a class="campaign_link" href="/test2">test2</a>
                            <span class="currency">£28,000.00</span>
                        </div>
                    </li>
                    <li class="">
                        <div class="campaign-logo-frame">
                            <a class="campaign_link" href="/test3">test3</a>
                            <span class="currency">£28,000.00</span>
                        </div>
                    </li>
                    <li class="">
                        <div class="campaign-logo-frame">
                            <a class="campaign_link" href="/test4">test4</a>
                            <span class="currency">£28,000.00</span>
                        </div>
                    </li>
                </ul>
            </div>
        </div>
    </div>
</div>

What I want to do is create an object from the data in the div above. For example, it would look like this:

{
name: 'Jaroslaw Chujczynski',
location: 'Leeds, United Kingdom',
amountCurrency: '£28,000.00 (in overfunding)',
lastInvestments: [
 {
  name: 'test1',
  currency: '£28,000.00'
 }, {
  name: 'test2',
  currency: '£28,000.00'
 }, {
  name: 'test3',
  currency: '£28,000.00'
 }, {
  name: 'test4',
  currency: '£28,000.00'
 }]
}

Of course, my HTML will contain many such divs, so I want to build an array of these objects.

OK, here is what I have so far:

const fs = require('fs');
const cheerio = require('cheerio');

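// Read the whole HTML file into a string (synchronously).
// Note: if reading fails, the error is logged and undefined is returned.
const getAllData = (fileName) => {
    try {
        return fs.readFileSync(fileName, 'utf8');
    } catch (e) {
        console.log('Error:', e.stack);
    }
};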
const data = getAllData('test.html');
const $ = cheerio.load(data);

const filterData = () => {
    console.log($('div[class="user_container"]'));
}

filterData();

What it returns is something like this, which is not what I want (or does it have to be like this?):

 namespace: 'http://www.w3.org/1999/xhtml',
    attribs: [Object: null prototype] {
      class: 'user_container'
    },
    'x-attribsNamespace': [Object: null prototype] {
      class: undefined
    },
    'x-attribsPrefix': [Object: null prototype] {
      class: undefined
    },
    children: [ [Node], [Node], [Node], [Node], [Node], [Node] ],
    parent: Node {
      type: 'tag',
      name: 'section',
      namespace: 'http://www.w3.org/1999/xhtml',
      attribs: [Object: null prototype],
      'x-attribsNamespace': [Object: null prototype],
      'x-attribsPrefix': [Object: null prototype],
      children: [Array],
      parent: [Node],
      prev: [Node],
      next: [Node]
    },
    etc....

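So I'm not sure, but I think I first have to get an array of the div blocks whose class is user_container, and once I have it, iterate over that array to create an object for each one.

Can someone help me with this?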

HTML is closely related to XML (though not strictly a subset of it), so you could look at the XML tools: have such a tool parse the HTML, and then you can run XML queries against it with the tool. This would let you extract the data, which you can then convert to JSON.

A quick Google search turns up the following XML tools for Node.js, but there are more:

https://www.npmjs.com/package/fast-xml-parser - says it will also export to JSON

http://www.curtismlarson.com/blog/2018/10/03/edit-xml-node-js/ - has a detailed walkthrough.

I can at least get you started:

const data = $('.user_container').get().map(div => {
  return {
    name: $(div).find('.name').text(),
    location: $(div).find('.location_with_flag').text(),
    amountCurrency: $(div).find('.amount.currency').text(),
  }
})

