Node.js and MongoDB: ignore the insert if an exact document match exists

Question

I maintain a collection of unique values alongside a companion collection that holds instances of those values. I set it up this way because the companion collection has more than 10 million records, whereas the unique values collection only adds up to about 100K, and I use those values all over the place and do partial-match lookups on them.

When I upload a CSV file, I usually insert 10k to 500k records at a time into the companion collection. What is the best way to insert only the values that don't already exist into the unique values collection?

Example:

// Insert large quantities of objects into mongo
var bulkInsert = [
    {
        name: "Some Name",
        other: "zxy",
        properties: "abc"
    },
    {
        name: "Some Name",
        other: "zxy",
        properties: "abc"
    },
    {
        name: "Other Name",
        other: "zxy",
        properties: "abc"
    }
];

// Need to insert only values that do not already exist in the mongo unique values collection
var uniqueValues = [
    {
        name: "Some Name"
    },
    {
        name: "Other Name"
    }
];
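For reference, the uniqueValues array above can be derived from bulkInsert in memory before anything is sent to the database. This is only a sketch: the helper name uniqueNames is made up for illustration, and it assumes the name field alone identifies a unique value. It removes duplicates within one upload but does not know what is already stored in mongo.

```javascript
// Hypothetical helper: collect each distinct `name` once, in first-seen order.
function uniqueNames(records) {
  const seen = new Set();
  const result = [];
  for (const record of records) {
    if (!seen.has(record.name)) {
      seen.add(record.name);
      result.push({ name: record.name });
    }
  }
  return result;
}

const sampleRecords = [
  { name: "Some Name", other: "zxy", properties: "abc" },
  { name: "Some Name", other: "zxy", properties: "abc" },
  { name: "Other Name", other: "zxy", properties: "abc" },
];

const derivedUniqueValues = uniqueNames(sampleRecords);
// derivedUniqueValues is [ { name: "Some Name" }, { name: "Other Name" } ]
```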

EDIT: I tried creating a unique index on the field, but as soon as it hits a duplicate in the array of documents I am inserting, it stops the whole process and doesn't go on to check any of the values after that point.

Answer 1

Figured it out. If you're doing it from the shell, you need to use Bulk() and create insert jobs like this:

var bulk = db.collection.initializeUnorderedBulkOp();
bulk.insert( { name: "1234567890a"} );
bulk.insert( { name: "1234567890b"} );
bulk.insert( { name: "1234567890"} );
bulk.execute();

And in Node, the continueOnError flag works on a plain collection.insert():

collection.insert([{name: "1234567890a"}, {name: "1234567890c"}], {continueOnError: true}, function (err, doc) {});
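In more recent versions of the Node driver, the same behaviour is usually expressed as insertMany with { ordered: false }, which keeps going past duplicates and reports them at the end. The sketch below is an assumption-laden illustration, not the answer's original code: the function names are made up, and it assumes the thrown bulk error exposes a writeErrors property whose entries carry a numeric code (11000 is MongoDB's duplicate key error code).

```javascript
const DUPLICATE_KEY = 11000; // MongoDB's duplicate key error code

// True when every individual write error in a bulk error is a duplicate key error.
// Assumption: `err.writeErrors` is either one error object or an array of them.
function onlyDuplicateKeyErrors(err) {
  const writeErrors = [].concat(err && err.writeErrors ? err.writeErrors : []);
  return writeErrors.length > 0 && writeErrors.every((e) => e.code === DUPLICATE_KEY);
}

// Hypothetical wrapper: insert everything, ignore duplicates, rethrow anything else.
async function insertIgnoringDuplicates(collection, docs) {
  try {
    await collection.insertMany(docs, { ordered: false });
  } catch (err) {
    if (!onlyDuplicateKeyErrors(err)) throw err;
  }
}
```

This still relies on the unique index on name; the index is what rejects the duplicates, and the unordered insert only stops them from aborting the rest of the batch.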

Answer 2

Well, I think the solution here is quite simple, if I understand your issue correctly. Since the process stops when it finds a duplicate value, you should check whether the value already exists before trying to add it.

So, for each element in uniqueValues, run a find/findOne query; if it returns no result, add the element, otherwise skip it.
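One query per element gets slow at 10k to 500k records, and a separate find followed by an insert can race with another writer. A possible alternative, sketched here under the assumption that name is the only field stored, is to let the server do the check with upserts: each operation inserts the document only when no match exists. The helper name buildUpsertOps is made up for illustration; the operations it builds use the updateOne shape accepted by the driver's bulkWrite.

```javascript
// Hypothetical helper: turn unique values into "insert if missing" bulk operations.
function buildUpsertOps(values) {
  return values.map((value) => ({
    updateOne: {
      filter: { name: value.name },                   // exact match on the unique field
      update: { $setOnInsert: { name: value.name } }, // only written when inserting
      upsert: true,                                   // insert when nothing matches
    },
  }));
}

const upsertOps = buildUpsertOps([{ name: "Some Name" }, { name: "Other Name" }]);

// Usage, not executed here (needs a connected collection):
//   await collection.bulkWrite(upsertOps, { ordered: false });
```

Keeping the unique index in place is still advisable, since concurrent upserts on the same value can otherwise both insert.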

Answer 1
1 · Accepted · 2014-10-30 23:17:13

Answer 2
0 · 2014-10-30 22:24:15