
Node.js and MongoDB: if an exact-match document exists, ignore insert

I am maintaining a collection of unique values alongside a companion collection that holds instances of those values. The reason I have it that way is that the companion collection has >10 million records, whereas the unique values collection only adds up to about 100K, and I use those values all over the place and do partial-match lookups on them.

When I upload a CSV file, it is usually 10k to 500k records at a time that I insert into the companion collection. What is the best way to insert only values that don't already exist into the unique values collection?

Example:

// Insert large quantities of objects into mongo
var bulkInsert = [
    {
        name: "Some Name",
        other: "zxy",
        properties: "abc"
    },
    {
        name: "Some Name",
        other: "zxy",
        properties: "abc"
    },
    {
        name: "Other Name",
        other: "zxy",
        properties: "abc"
    }
];

// Need to insert only values that do not already exist in the mongo unique values collection
var uniqueValues = [
    {
        name: "Some Name"
    },
    {
        name: "Other Name"
    }
];
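
One way to get from bulkInsert to uniqueValues before touching the database is to collapse the uploaded rows in memory. The sketch below is only an illustration: the helper name uniqueNameDocs is hypothetical, and it assumes the unique key is the name field, as in the example above. It removes duplicates within the upload only; values that already exist in the collection still have to be handled on insert.

```javascript
// Hypothetical helper (not from the original post): keep one { name } document
// per distinct name, in first-seen order.
function uniqueNameDocs(docs) {
  const seen = new Set();
  const result = [];
  for (const doc of docs) {
    if (!seen.has(doc.name)) {
      seen.add(doc.name);
      result.push({ name: doc.name });
    }
  }
  return result;
}

const bulkInsert = [
  { name: "Some Name", other: "zxy", properties: "abc" },
  { name: "Some Name", other: "zxy", properties: "abc" },
  { name: "Other Name", other: "zxy", properties: "abc" }
];

// Two documents remain: "Some Name" and "Other Name".
const uniqueValues = uniqueNameDocs(bulkInsert);
console.log(uniqueValues);
```

With 10k to 500k rows per upload, a single pass with a Set keeps this step linear in the number of rows.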

EDIT: I tried creating a unique index on the field, but once it finds a duplicate in the array of documents that I am inserting, it stops the whole process and doesn't proceed to check any values after the break.

Figured it out. If you're doing it from the shell, you need to use Bulk() and create insert jobs like this:

var bulk = db.collection.initializeUnorderedBulkOp();
bulk.insert( { name: "1234567890a"} );
bulk.insert( { name: "1234567890b"} );
bulk.insert( { name: "1234567890"} );
bulk.execute();

and in Node, the continueOnError flag works on a straight collection.insert():

collection.insert([{ name: "1234567890a" }, { name: "1234567890c" }], { continueOnError: true }, function (err, doc) {});
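
In more recent versions of the Node.js driver, the same idea is usually expressed with insertMany and the ordered: false option, which keeps going after a failed document. The following is a rough sketch, not the original poster's code. The function name insertIgnoringDuplicates is hypothetical, collection is assumed to be a driver Collection with a unique index on name, and the error shape (a writeErrors list whose entries carry code 11000 for duplicate keys) is an assumption that may vary between driver versions.

```javascript
// Sketch: unordered bulk insert that treats duplicate-key errors as "skip".
// Assumes `collection` is a MongoDB Node.js driver Collection with a unique index.
async function insertIgnoringDuplicates(collection, docs) {
  if (docs.length === 0) {
    return { inserted: 0, duplicates: 0 }; // the driver rejects an empty batch
  }
  try {
    // ordered: false lets the server attempt every document in the batch.
    const result = await collection.insertMany(docs, { ordered: false });
    return { inserted: result.insertedCount, duplicates: 0 };
  } catch (err) {
    // writeErrors may be a single object or an array depending on the driver.
    let writeErrors = [].concat(err.writeErrors || []);
    if (writeErrors.length === 0 && err.code === 11000) {
      writeErrors = [err]; // assumption: a lone duplicate reported on the error itself
    }
    const onlyDuplicates =
      writeErrors.length > 0 && writeErrors.every((e) => e.code === 11000);
    if (!onlyDuplicates) {
      throw err; // anything other than a duplicate key is a real failure
    }
    return {
      inserted: docs.length - writeErrors.length,
      duplicates: writeErrors.length
    };
  }
}
```

It could be called as `await insertIgnoringDuplicates(collection, uniqueValues)`. Rethrowing anything that is not a duplicate-key error keeps genuine failures visible.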

Well, I think the solution here is quite simple, if I understand your issue correctly. Since the process stops when it finds a duplicated field, you should basically check whether the value already exists before trying to add it.

So, for each element in uniqueValues, make a find/findOne query; if it doesn't return any result, add the element, otherwise don't.
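
The check-then-insert loop described above could look roughly like this. It is a minimal sketch under assumptions: the function name insertMissing is hypothetical, collection is assumed to be a Node.js driver Collection, and name is assumed to be the field that identifies a value.

```javascript
// Sketch: look each value up first and insert it only when nothing is found.
// Assumes `collection` exposes the driver's findOne and insertOne methods.
async function insertMissing(collection, values) {
  let inserted = 0;
  for (const value of values) {
    const existing = await collection.findOne({ name: value.name });
    if (!existing) {
      await collection.insertOne({ name: value.name });
      inserted += 1;
    }
  }
  return inserted; // number of values that were actually added
}
```

Note that this costs at least one round trip per value, and the check and the insert are not atomic, so two concurrent uploads could still both insert the same name unless a unique index is in place.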
