Querying a large MongoDB collection using pymongo

I want to query a MongoDB collection that has more than 5k records. Each record has key-value pairs like:

{
 "A" : "unique-value1",
 "B" : "service1",
 "C" : 1.2321,
 ...
},
...

Here A always has a unique value, B has a value such as service1, service2, ..., service8, and C is a float value.

What I want is to get records like these, with the following key-value pairs:

{
 "A" : "unique-value1",
 "B" : "service1",
 "C" : 1.2321
}

{
  "A" : "unique-value2",
  "B" : "service2",
  "C" : 0.2321
}
{
  "A" : "unique-value3",
  "B" : "service1",
  "C" : 3.2321
}

I am not sure how to do this. Earlier I used MapReduce, but at that time I only needed to generate records with the A and C key-value pairs. Now that I also need B, I do not know what I should do.

This is what I was doing:

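from bson.code import Code

# Map step: emit A as the key and C (parsed as a float) as the value.
map_reduce = Code("""
        function () {
            emit(this.A, parseFloat(this.C));
        }
        """)
# `reduce`, `my_collection` and `out_handle` are defined elsewhere (not shown).
result = my_collection.map_reduce(map_reduce, reduce, out='temp_collection')

# Upsert each map-reduce result into the output collection, matching on A.
for doc in result.find({}):
    out = dict()
    out[doc['_id']] = doc['_id']
    out['cost'] = doc['value']
    out_handle.update_one(
        {'A': doc['_id']},
        {'$set': out},
        upsert=True
        )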

Unless I've misunderstood what you need, it looks like you are making this harder than it needs to be. Just project the keys you want using the second parameter of the find method.

for record in db.testcollection.find({}, { 'A': 1, 'B': 1, 'C': 1}):
    db.existingornewcollection.replace_one({'_id': record['_id']}, record, upsert=True)
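
To make the effect of that second parameter concrete, here is a plain-Python sketch of what an inclusion projection does to a single document. It is only a model and does not talk to MongoDB; the helper name `project` and the sample `_id` value are made up for illustration. The point to notice is that `_id` comes back by default unless it is explicitly set to 0, which is why the loop above can reuse `record['_id']` as the filter.

```python
def project(document, projection):
    """Plain-Python model of an inclusion projection (illustrative, not pymongo)."""
    # Fields flagged with 1 are kept.
    wanted = {key for key, flag in projection.items() if flag}
    # _id is included by default; it is dropped only when explicitly set to 0.
    if projection.get('_id', 1):
        wanted.add('_id')
    return {key: value for key, value in document.items() if key in wanted}


# Hypothetical sample record; the _id value is invented for the example.
record = {'_id': 42, 'A': 'unique-value1', 'B': 'service1', 'C': 1.2321, 'D': 'D'}

with_id = project(record, {'A': 1, 'B': 1, 'C': 1})
without_id = project(record, {'A': 1, 'B': 1, 'C': 1, '_id': 0})
print(with_id)
print(without_id)
```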

Full example:

from pymongo import MongoClient
from bson.json_util import dumps
db = MongoClient()['testdatabase']

db.testcollection.insert_one({
    "A": "unique-value1",
    "B": "service1",
    "C": 1.2321,
    "D": "D",
    "E": "E",
    "F": "F",
})

for record in db.testcollection.find({}, { 'A': 1, 'B': 1, 'C': 1}):
    db.existingornewcollection.replace_one({'_id': record['_id']}, record, upsert=True)

print(dumps(db.existingornewcollection.find_one({}, {'_id': 0}), indent=4))

gives:

{
    "A": "unique-value1",
    "B": "service1",
    "C": 1.2321
}
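
If the collection grows well beyond 5k records, one possible alternative to looping in Python is to let the server do the copy with an aggregation pipeline. The sketch below is not part of the answer above; it assumes MongoDB 4.2 or later (for the `$merge` stage), and `build_copy_pipeline` is a hypothetical helper name. It only builds the pipeline, so it runs without a database connection.

```python
def build_copy_pipeline(keys, target_collection):
    """Build a pipeline that keeps only `keys` (plus _id) and merges into a target."""
    return [
        # Same inclusion projection as the find() call: keep the listed keys.
        {'$project': {key: 1 for key in keys}},
        # Upsert each projected document into the target collection by _id.
        {'$merge': {
            'into': target_collection,
            'on': '_id',
            'whenMatched': 'replace',
            'whenNotMatched': 'insert',
        }},
    ]


pipeline = build_copy_pipeline(['A', 'B', 'C'], 'existingornewcollection')
print(pipeline)

# With a live connection (assumed, as in the full example) this would be run as:
# db.testcollection.aggregate(pipeline)
```

For a few thousand records the simple loop is likely fine; the pipeline mainly saves the round trip of every document through the client.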
