
Convert CSV to JSON and remove duplicates while saving the data to MongoDB using Node.js

//products.csv

| uniqueCode | name     |
|:-----------|---------:|
| 0001       | mouse    |
| 0002       | keyboard |
| 0003       | monitor  |
| 0001       | mouse    |

//code to convert csv to json and to save the data to mongodb

    const csv = require("csvtojson");

    router.post("/uploadProducts", async (req, res) => {
      const products = await csv().fromFile("./products.csv");
      try {
        products.map(async (pdata) => {
          let uniqueCode = await Product.findOne({
            product_code: pdata.productCode,
          });
          if (!uniqueCode) {
            //create new object
            let product = new Product({
              product_code: pdata.productCode,
              product_name: pdata.name,
            });
            await product.save();
          }
        });
        res.send("success");
      } catch (err) {
        console.error(err);
      }
    });

In the above code I check whether the unique code already exists in the database. If it does not, I create a new object; otherwise I ignore that row. So basically I'm trying to remove duplicates while saving the data.

But the problem is that the duplicates are also getting saved in the DB. In the products.csv file, the first row has the unique code 0001 and the last row has the same unique code. So while mapping over the objects, the last object should have been ignored, but it is not; it gets saved anyway.

The data only gets saved once the mapping is done, all at once.

Creating a local array and comparing the objects works, but I want a solution that works directly with MongoDB.

Can anyone help me with this?

There are two problems:

  1. `product.save()` is asynchronous as well but not waited for with `await`. This way, if two subsequent lines contain a product with the same id, it is likely that while the second line is being processed and the database is checked with `Product.findOne`, the insertion of the product from the previous line is not yet done. The guard check therefore fails and you insert another item.

  2. You supply an asynchronous function to `map`. Every asynchronous function returns a Promise, so `map` only converts your array of products into an array of promises that run asynchronously, and it is probably finished before any call to MongoDB has even reached the database. Therefore, you have no influence on the order in which the database calls execute and cannot expect them to run sequentially. Most probably `findOne` has already been called for all your products before any call to `product.save()` was made at all, so it finds nothing (`null`) for every product with the same id, unless that id was already in the DB before your procedure was called. This also implies that you send `success` to the client long before the whole database operation is done, and that the surrounding `try`/`catch` does not catch errors thrown inside the callbacks.
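The ordering problem described in the two points above can be reproduced without a database. The following is only a sketch: `makeFakeStore` is a hypothetical in-memory stand-in for the `Product` collection (it is not part of Mongoose), and its `findOne` and `save` simply resolve after a short timer to imitate a database round trip. The helper names `insertIfMissing`, `insertConcurrently` and `insertSequentially` are likewise made up for illustration.

```javascript
// Hypothetical in-memory stand-in for the Product collection. Each call
// resolves after a short timer, roughly like a database round trip.
function makeFakeStore(delayMs = 5) {
  const docs = [];
  const wait = () => new Promise((resolve) => setTimeout(resolve, delayMs));
  return {
    docs,
    async findOne(code) {
      await wait();
      return docs.find((d) => d.product_code === code) || null;
    },
    async save(doc) {
      await wait();
      docs.push(doc);
    },
  };
}

// Same shape as the callback in the question: look up, insert if missing.
async function insertIfMissing(store, row) {
  const existing = await store.findOne(row.uniqueCode);
  if (!existing) {
    await store.save({ product_code: row.uniqueCode, product_name: row.name });
  }
}

// Racy: map starts every callback immediately, so all lookups are issued
// before any save has completed.
async function insertConcurrently(store, rows) {
  await Promise.all(rows.map((row) => insertIfMissing(store, row)));
}

// Sequential: each row's lookup and save finish before the next row starts.
async function insertSequentially(store, rows) {
  for (const row of rows) {
    await insertIfMissing(store, row);
  }
}

const csvRows = [
  { uniqueCode: "0001", name: "mouse" },
  { uniqueCode: "0002", name: "keyboard" },
  { uniqueCode: "0003", name: "monitor" },
  { uniqueCode: "0001", name: "mouse" },
];

async function demo() {
  const racy = makeFakeStore();
  await insertConcurrently(racy, csvRows);
  const ordered = makeFakeStore();
  await insertSequentially(ordered, csvRows);
  return { racy: racy.docs.length, ordered: ordered.docs.length };
}

demo().then((counts) => console.log(counts)); // { racy: 4, ordered: 3 }
```

With the concurrent version the duplicate row is stored as well, while the `for...of` loop with `await` skips it. In a route handler, the response would then be sent only after the loop has finished.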

Solution: Get rid of the duplicates in your CSV data before sending them to your database. This will also make your program more efficient, because fewer queries are issued. Then use your function on this deduplicated list and everything will work as expected.
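A minimal sketch of that deduplication step, assuming the parsed rows are plain objects keyed by the CSV column names (`uniqueCode`, `name`). The helper name `dedupeBy` is made up for illustration, and the sample rows mirror products.csv.

```javascript
// Keep only the first row seen for each value of the given key.
function dedupeBy(rows, key) {
  const seen = new Map();
  for (const row of rows) {
    if (!seen.has(row[key])) {
      seen.set(row[key], row);
    }
  }
  return [...seen.values()];
}

// Sample data mirroring products.csv after parsing.
const rows = [
  { uniqueCode: "0001", name: "mouse" },
  { uniqueCode: "0002", name: "keyboard" },
  { uniqueCode: "0003", name: "monitor" },
  { uniqueCode: "0001", name: "mouse" },
];

const uniqueRows = dedupeBy(rows, "uniqueCode");
console.log(uniqueRows.map((r) => r.uniqueCode)); // [ '0001', '0002', '0003' ]
```

A `Map` preserves insertion order, so the rows come out in the order in which their codes first appeared. Rows whose code already exists in the database still need the `findOne` check (or a unique index on the code field) when they are saved.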


 