Querying a large MongoDB collection using pymongo
I want to query my MongoDB collection, which has more than 5k records. Each record has key-value pairs like:
{
"A" : "unique-value1",
"B" : "service1",
"C" : 1.2321,
...
},
...
Here A always holds a unique value, B holds a value such as service1, service2, ... service8, and C is a float value.
What I want is to get records like these, with the following key-value pairs:
{
"A" : "unique-value1",
"B" : "service1",
"C" : 1.2321
}
{
"A" : "unique-value2",
"B" : "service2",
"C" : 0.2321
}
{
"A" : "unique-value3",
"B" : "service1",
"C" : 3.2321
}
I am not sure how to do this. Earlier I used MapReduce, but at that time I only needed to generate records with the A and C key-value pairs. Now that I also need B, I do not know what to do.
This is what I was doing:
from bson.code import Code

# Map step: emit A as the key and C (parsed as a float) as the value.
map_reduce = Code("""
function () {
    emit(this.A, parseFloat(this.C));
}
""")
# `reduce` (the reduce function) is not shown here.
result = my_collection.map_reduce(map_reduce, reduce, out='temp_collection')
for doc in result.find({}):
    out = dict()
    out[doc['_id']] = doc['_id']
    out['cost'] = doc['value']
    out_handle.update_one(
        {'A': doc['_id']},
        {'$set': out},
        upsert=True
    )
Unless I've misunderstood what you need, it looks like you are making this harder than it needs to be. Just project the keys you want using the second parameter of the find method.
for record in db.testcollection.find({}, {'A': 1, 'B': 1, 'C': 1}):
    db.existingornewcollection.replace_one({'_id': record['_id']}, record, upsert=True)
Full example:
from pymongo import MongoClient
from bson.json_util import dumps
db = MongoClient()['testdatabase']
db.testcollection.insert_one({
"A": "unique-value1",
"B": "service1",
"C": 1.2321,
"D": "D",
"E": "E",
"F": "F",
})
for record in db.testcollection.find({}, {'A': 1, 'B': 1, 'C': 1}):
    db.existingornewcollection.replace_one({'_id': record['_id']}, record, upsert=True)
print(dumps(db.existingornewcollection.find_one({}, {'_id': 0}), indent=4))
gives:
{
"A": "unique-value1",
"B": "service1",
"C": 1.2321
}
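For a larger collection, the same projection-and-copy could probably be done on the server with an aggregation pipeline instead of one replace_one call per record. The sketch below only builds the pipeline as plain dictionaries, so it runs without a database. The helper name build_copy_pipeline is hypothetical, and the $merge stage assumes MongoDB 4.2 or newer; the collection names are the ones used in the example above.

```python
def build_copy_pipeline(fields, target):
    """Build an aggregation pipeline that keeps only `fields` (plus _id)
    and upserts the result into the `target` collection.

    Hypothetical helper for illustration; assumes the server supports
    the $merge stage (MongoDB 4.2+).
    """
    # Same projection document as the second argument to find() above.
    projection = {name: 1 for name in fields}
    return [
        {"$project": projection},
        {
            "$merge": {
                "into": target,
                "on": "_id",
                "whenMatched": "replace",    # overwrite an existing record
                "whenNotMatched": "insert",  # otherwise insert it (upsert)
            }
        },
    ]


pipeline = build_copy_pipeline(["A", "B", "C"], "existingornewcollection")
print(pipeline)

# With a live connection, the pipeline would be run like this:
# db.testcollection.aggregate(pipeline)
```

The for loop with replace_one shown earlier remains the simpler choice for a few thousand records; the pipeline mainly saves round trips between the client and the server.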