C# Query On MongoDB Not Returning Correct Results

I'm currently running into an issue when querying MongoDB from C#. The query returns neither the correct results nor the correct number of results. I don't know the exact number to expect, but it should be fewer than 100; instead, I am receiving around 350k-500k results, many of which are null. The other problem is that the program takes upwards of 10 minutes to finish processing.

The problematic portion of the code is the following:

public List<BsonDocument> find_All_Documents_With_pIDs()
    {            
        List<string> databases = new List<string>();
        List<BsonDocument> pDocs = new List<BsonDocument>(); 
        databases.AddRange(mongo_Server.GetDatabaseNames());

        //iterate through each db in mongo
        foreach (string dbName in databases)
        {
            List<string> collections = new List<string>();
            var database = mongo_Server.GetDatabase(dbName);
            collections.AddRange(database.GetCollectionNames());

            //iterate through each collection
            foreach (string colName in collections)
            {
                var collection = database.GetCollection(colName);

                //Iterate through each document
                foreach (var document in collection.FindAllAs<BsonDocument>())
                {   
                    //Get all documents that have a pID in either the main document or its sub document                     
                    IMongoQuery query = Query.Exists(document.GetElement("_id").ToString().Remove(0,4) + ".pID");
                    IMongoQuery subQuery = Query.Exists(document.GetElement("_id").ToString() + ".SubDocument.pID");
                    pDocs.AddRange(collection.Find(query));
                    pDocs.AddRange(collection.Find(subQuery));
                }
            }
        }

        // A collection is used earlier in the program to back up the documents before processing. The documents found in that backup collection must be removed from the list, or the result will contain duplicates.
        return remove_Backup_Documents_From_List(pIDs);
    }

Any help is appreciated!

EDIT:

The following is a screen capture of the data received. Not all of the data is null like this, but a very large amount is:

[Image: screen capture of the data received]

Your script is first fetching all of your documents from the database

collection.FindAllAs<BsonDocument>()

and then assembling a query for each one. With N documents in a collection, that is 2N additional queries on top of the full scan, which is probably why it is so slow.

As an alternative, you could do the following:

// Requires: using System.Linq; using MongoDB.Bson; using MongoDB.Driver; using MongoDB.Driver.Builders;
foreach (string colName in collections)
{
    var collection = database.GetCollection(colName);

    // Query for all documents that have a usable pID
    IMongoQuery query = Query.And(
        Query.Exists("pID"),                 // the field exists
        Query.NE("pID", BsonNull.Value),     // it is not null
        Query.NE("pID", BsonString.Empty));  // it is not the empty string ""

    // Query for all documents that have a usable SubDocument.pID
    IMongoQuery subQuery = Query.And(
        Query.Exists("SubDocument.pID"),                 // the field exists
        Query.NE("SubDocument.pID", BsonNull.Value),     // it is not null
        Query.NE("SubDocument.pID", BsonString.Empty));  // it is not the empty string ""

    // Match documents that satisfy either condition
    IMongoQuery totalQuery = Query.Or(query, subQuery);

    // Find returns a cursor; ToList() executes the query and materializes the results
    List<BsonDocument> results = collection.Find(totalQuery).ToList();
    if (results.Count > 0)
    {
        pDocs.AddRange(results); // only add to pDocs if the query returned at least one result
    }
}

That way you assemble a single query per collection that returns only the documents that have either the pID or the SubDocument.pID field set.
