Node.js: comparing two arrays

Question

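For the record, I'm a relatively new programmer.

My code works, but it seems clunky, and it would be slow if it had to go through many items.

Granted, this Node app doesn't need to be fast (the process could take 5 minutes and that would be fine), but I'm curious whether there is a better way to do this.

I have a Node app that compares two data sets. The goals of the program are as follows:

Compare a CSV file against data from an online API
Make sure every name in the CSV file exists in the array returned by the API
Print an error message to the screen (console.log()) instead of finishing

Here is the code so far: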

const fs = require("fs");
const csv = require("csv-parser");
const fetch = require("node-fetch");

const results = [];

fs.createReadStream("./customers.csv")
  .pipe(csv())
  .on("data", (data) => {
    results.push(data);
  })
  .on("end", () => {
    console.log("Getting Customer Data from Waze...");
    fetch("https://gql.waveapps.com/graphql/public", {
      method: "post",
      headers: {
        //prettier-ignore
        'Authorization': "Bearer MyAuth",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        query: `
          query {
            business(id: "MyBusinessId") {
              customers {
                edges {
                  node {
                    id
                    name
                  }
                }
              }
            }
          }
        `,
      }),
    })
      .then((res) => res.json())
      .then(({ data }) => {
        console.log("Filtering Data...");
        // this maps through the csv file
        results.map((csv) => {
          let array = [];
          name = "";
          data.business.customers.edges.map((customer) => {
            // push the results of the expression (true of false) to an array
            array.push(
              customer.node.name.toLowerCase() === csv.name.toLowerCase()
            );
            // push nonexistent name (if there is one) variable so error handling is clear
            if (customer.node.name.toLowerCase() !== csv.name.toLowerCase()) {
              name = csv.name;
            }
          });
          // if all elements in array are false, that means there is no matching name in the data.business.customers.edges array and error will be true, if there is a true field in the name, return false
          const error = !array.some((el) => {
            if (el) {
              return true;
            }
          });

          if (error) {
            return console.log(
              `Name: ${name} not found in Waze customer list, please check your spelling`
            );
          }
          // send http request here
        });
        console.log("Finished Sending Invoices");
      });
  });

The customers.csv file:

"name","domain","expiration-date"
"bob","yahoo.com","7/2/2020"
"suzie","google.com","12/1/2020"

The GraphQL API returns data that looks like this:

[
  {
    node: {
      id: 'QnVzaW5lc3M6MzE4NmRmNDQtZDg4Zi00MzgxLTk5ZGEtYTQzMWRmYzhmMDk5O0N1c3RvbWVyOjQ3NTg0Mzc2',
      name: 'NOInvoice'
    }
  },
  {
    node: {
      id: 'QnVzaW5lc3M6MzE4NmRmNDQtZDg4Zi00MzgxLTk5ZGEtYTQzMWRmYzhmMDk5O0N1c3RvbWVyOjQ3NTg0MzU3',
      name: 'Suzie'
    }
  },
  {
    node: {
      id: 'QnVzaW5lc3M6MzE4NmRmNDQtZDg4Zi00MzgxLTk5ZGEtYTQzMWRmYzhmMDk5O0N1c3RvbWVyOjQ3NTgwODkx',
      name: 'Bob'
    }
  }
]

Any help would be greatly appreciated.

Answer 1

Nested map calls = O(n*m) time complexity = poor performance

First build a hashmap of the names returned by the API, then scan the CSV array and check each name against the hashmap to see whether it exists.

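Using a hashmap is a common way to improve the performance of nested loops. The result is closer to O(n+m) time complexity, which is a significant improvement.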
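As a rough illustration of that difference, the sketch below counts the work each approach does. The sizes (1,000 CSV names, 2,000 API names) and the generated names are made up for the example, and a Set stands in for the hashmap.

```javascript
// Hypothetical sizes and names, chosen only to make the counts easy to read.
const n = 1000; // number of CSV rows
const m = 2000; // number of API customers
const csvNames = Array.from({ length: n }, (_, i) => `name${i}`);
const apiNames = Array.from({ length: m }, (_, i) => `name${i}`);

// Nested loops: every CSV name is compared with every API name.
let nestedComparisons = 0;
for (const csvName of csvNames) {
  for (const apiName of apiNames) {
    nestedComparisons++; // one csvName === apiName comparison would happen here
  }
}

// Lookup approach: one insert per API name, then one lookup per CSV name.
let lookupOperations = 0;
const apiSet = new Set();
for (const apiName of apiNames) {
  apiSet.add(apiName);
  lookupOperations++;
}
for (const csvName of csvNames) {
  apiSet.has(csvName);
  lookupOperations++;
}

console.log(nestedComparisons); // 2000000 (n * m)
console.log(lookupOperations); // 3000 (n + m)
```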

  // create hash of valid names from API
  // (keys are lowercased because the CSV has "bob" while the API has "Bob")
  const validNames = data.business.customers.edges.reduce(
    (names, customer) => {
      names[customer.node.name.toLowerCase()] = customer;   /* or = true */
      return names;
    },
    {}
  );

  // count the names in the csv that are not valid
  const err = results.reduce(
    (count, csv) => (validNames[csv.name.toLowerCase()] ? count : count + 1),
    0
  );
  if (err > 0) {
    // have invalid names in CSV
  }

  // OR alternatively, find the invalid entries
  const invalid = results.reduce(
    (invalid, csv) => {
      if (!validNames[csv.name.toLowerCase()]) invalid.push(csv);
      return invalid;
    },
    []
  );

Edit

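  // OR shorter version of find the invalid entries
  // (lowercase the CSV name so the comparison is case-insensitive, as in the question)
  const invalid = results.filter(csv => !validNames[csv.name.toLowerCase()]);
  if (invalid.length) {
    // have invalid names in CSV
  }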
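For anyone who wants to try the lookup idea in isolation, here is a minimal self-contained sketch. It assumes the data shapes shown in the question, uses a Set instead of a plain object, and lowercases and trims names before comparing. The `normalize` helper, the third CSV row ("alice") and the short ids are made up for illustration.

```javascript
// Sample data shaped like the question's CSV rows and GraphQL edges.
const results = [
  { name: "bob", domain: "yahoo.com" },
  { name: "suzie", domain: "google.com" },
  { name: "alice", domain: "example.com" }, // hypothetical row with no matching customer
];
const edges = [
  { node: { id: "1", name: "NOInvoice" } },
  { node: { id: "2", name: "Suzie" } },
  { node: { id: "3", name: "Bob" } },
];

// Hypothetical helper: make "Bob" and " bob " compare equal.
const normalize = (name) => name.trim().toLowerCase();

// Build the lookup once: one pass over the API customers.
const validNames = new Set(edges.map((edge) => normalize(edge.node.name)));

// Check each CSV row with a single lookup: one pass over the CSV rows.
const missing = results
  .filter((row) => !validNames.has(normalize(row.name)))
  .map((row) => row.name);

for (const name of missing) {
  console.log(`Name: ${name} not found in customer list, please check your spelling`);
}
```

A Set is used here because a plain object inherits keys such as "constructor" and "toString", so a CSV row with one of those names would wrongly pass a truthiness check like `validNames[csv.name]`.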

Answer 2

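I think you're using a lot of extra variables that you don't actually need, such as array, name and error. So this isn't a performance optimization; it's an attempt to address the clunkiness of the code. Here are some changes you might consider.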

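// build the list of customer names once, outside the loop
const customerNames = data.business.customers.edges.map((edge) => edge.node.name.toLowerCase());

results.forEach((csv) => {
  // compare case-insensitively, as the original code does
  if (!customerNames.some((name) => name === csv.name.toLowerCase())) {
    console.log(`Name: ${csv.name} not found in Waze customer list, please check your spelling`);
  }
});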

Instead of:

results.map((csv) => {
  let array = []; // <-- (1)
  name = ""; // <-- (2)
  data.business.customers.edges.map((customer) => {
    // push the results of the expression (true of false) to an array
    array.push(
      customer.node.name.toLowerCase() === csv.name.toLowerCase()
    );
    // push nonexistent name (if there is one) variable so error handling is clear
    if (customer.node.name.toLowerCase() !== csv.name.toLowerCase()) {
      name = csv.name; // <-- (3)
    }
  });
  // if all elements in array are false, that means there is no matching name in the data.business.customers.edges array and error will be true, if there is a true field in the name, return false
  const error = !array.some((el) => {
    if (el) {
      return true;
    }
  }); // <-- (4)

  if (error) { // <-- (5)
    return console.log(
      `Name: ${name} not found in Waze customer list, please check your spelling`
    );
  }
  // send http request here
});

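(1) array holds boolean values that record whether csv.name was found in data (the GraphQL response). That array is then iterated at (4). You don't need two steps that iterate over two different arrays when a single call to some that compares the names gives you the answer.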

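At (2) you define a variable name, and at (3) you keep updating it with the same value, csv.name (which never changes, because it doesn't depend on customer at all). So I would remove that variable entirely.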

(5) All you care about in the log is csv.name, so that's exactly what the shorter version uses.
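To check that dropping the intermediate array does not change the outcome, here is a small sketch that runs both forms side by side. The function names `isMissingTwoStep` and `isMissingOneStep` and the CSV name "alice" are made up for the example; the customer names come from the question's sample response.

```javascript
const edges = [
  { node: { name: "NOInvoice" } },
  { node: { name: "Suzie" } },
  { node: { name: "Bob" } },
];

// Two-step form, as in the question: collect booleans, then scan them.
function isMissingTwoStep(csvName) {
  const matches = [];
  edges.forEach((customer) => {
    matches.push(customer.node.name.toLowerCase() === csvName.toLowerCase());
  });
  return !matches.some((matched) => matched);
}

// One-step form: `some` stops at the first match and needs no extra array.
function isMissingOneStep(csvName) {
  return !edges.some(
    (customer) => customer.node.name.toLowerCase() === csvName.toLowerCase()
  );
}

// Both forms should agree for every name.
for (const csvName of ["bob", "suzie", "alice"]) {
  console.log(csvName, isMissingTwoStep(csvName), isMissingOneStep(csvName));
}
```

Both functions report "bob" and "suzie" as present and "alice" as missing, so the shorter form can replace the longer one without changing which names get logged.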
