I am trying to list all the virtual machines (VMs) in my MongoDB database that use a certain data store, EMC_123. I have this script, but it also lists VMs that do not use the data store EMC_123.
#!/usr/bin/env python
import pprint
import pymongo


def run_query():
    server = '127.0.0.1'
    client = pymongo.MongoClient("mongodb://%s:27017/" % server)
    db = client["data_center_test"]
    collection = db["data_centers"]
    pipeline = [
        {"$match": {"clusters.hosts.vms.data_stores.name": "EMC_123"}},
        {"$group": {"_id": "$clusters.hosts.vms.name"}}
    ]
    # Create the printer once, so it also exists when the result set is empty
    pp = pprint.PrettyPrinter()
    for doc in collection.aggregate(pipeline):
        pp.pprint(doc)
    # Print the query plan for the same pipeline
    pp.pprint(db.command('aggregate', 'data_centers', pipeline=pipeline, explain=True))


def main():
    run_query()
    return 0


# Start program
if __name__ == "__main__":
    main()
I assume there is something wrong with my pipeline. Here is the plan that gets printed out:
{u'ok': 1.0,
 u'stages': [{u'$cursor': {u'fields': {u'_id': 0,
                                       u'clusters.hosts.vms.name': 1},
                           u'query': {u'clusters.hosts.vms.data_stores.name': u'EMC_123'},
                           u'queryPlanner': {u'indexFilterSet': False,
                                             u'namespace': u'data_center_test.data_centers',
                                             u'parsedQuery': {u'clusters.hosts.vms.data_stores.name': {u'$eq': u'EMC_123'}},
                                             u'plannerVersion': 1,
                                             u'rejectedPlans': [],
                                             u'winningPlan': {u'direction': u'forward',
                                                              u'filter': {u'clusters.hosts.vms.data_stores.name': {u'$eq': u'EMC_123'}},
                                                              u'stage': u'COLLSCAN'}}}},
             {u'$group': {u'_id': u'$clusters.hosts.vms.name'}}]}
UPDATE:
Here is a skeleton of what the document looks like:
{
    "name" : "data_center_name",
    "clusters" : [
        {
            "hosts" : [
                {
                    "name" : "esxi-hostname",
                    "vms" : [
                        {
                            "data_stores" : [ { "name" : "EMC_123" } ],
                            "name" : "vm-name1",
                            "networks" : [ { "name" : "vlan334" } ]
                        },
                        {
                            "data_stores" : [ { "name" : "some_other_data_store" } ],
                            "name" : "vm-name2",
                            "networks" : [ { "name" : "vlan334" } ]
                        }
                    ]
                }
            ],
            "name" : "cluster_name"
        }
    ]
}
The problem I am seeing is that vm-name2 shows up in the results even though it doesn't have EMC_123 as a data store.
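As far as I understand it, this happens because $match selects whole documents, not individual array elements: the data center document is kept as soon as any one of its VMs uses EMC_123, and the $group stage then reads the names of all VMs in that document. Here is a rough sketch in plain Python (no MongoDB involved) that emulates both behaviours on the skeleton document. The helper names iter_vms, uses_store, document_level_match and element_level_match are made up for illustration.

```python
# Skeleton document from above, reduced to the fields that matter here.
doc = {
    "name": "data_center_name",
    "clusters": [{
        "name": "cluster_name",
        "hosts": [{
            "name": "esxi-hostname",
            "vms": [
                {"name": "vm-name1",
                 "data_stores": [{"name": "EMC_123"}]},
                {"name": "vm-name2",
                 "data_stores": [{"name": "some_other_data_store"}]},
            ],
        }],
    }],
}


def iter_vms(document):
    """Yield every VM sub-document, whatever cluster or host it sits in."""
    for cluster in document.get("clusters", []):
        for host in cluster.get("hosts", []):
            for vm in host.get("vms", []):
                yield vm


def uses_store(vm, store):
    """True if the VM lists a data store with the given name."""
    return any(ds.get("name") == store for ds in vm.get("data_stores", []))


def document_level_match(document, store):
    """Emulate $match on the whole document, then read all VM names."""
    if any(uses_store(vm, store) for vm in iter_vms(document)):
        return [vm["name"] for vm in iter_vms(document)]
    return []


def element_level_match(document, store):
    """Emulate unwinding the arrays first, then matching each VM alone."""
    return [vm["name"] for vm in iter_vms(document) if uses_store(vm, store)]


print(document_level_match(doc, "EMC_123"))  # ['vm-name1', 'vm-name2']
print(element_level_match(doc, "EMC_123"))   # ['vm-name1']
```

The first function mirrors what my original pipeline does, which is why vm-name2 leaks into the output. The second one is the per-VM filtering I actually want.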
Update 2:
OK, I am now able to write a mongo shell query that does what I want. It is a little ugly:
db.data_centers.aggregate({$unwind: '$clusters'}, {$unwind: '$clusters.hosts'}, {$unwind: '$clusters.hosts.vms'}, {$unwind: '$clusters.hosts.vms.data_stores'}, {$match: {"clusters.hosts.vms.data_stores.name": "EMC_123"}})
I came across this approach in the second answer to this SO question: MongoDB Projection of Nested Arrays
Based on the answers there, I had to change my pipeline to this:
pipeline = [
    {'$unwind': '$clusters'},
    {'$unwind': '$clusters.hosts'},
    {'$unwind': '$clusters.hosts.vms'},
    {'$unwind': '$clusters.hosts.vms.data_stores'},
    {'$match': {"clusters.hosts.vms.data_stores.name": "EMC_123"}}
]
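The pipeline above returns one fully unwound document per matching VM and data store pair. If only the VM names are needed, a $group stage could be appended after the $match. The following is a minimal sketch, assuming the document layout shown earlier. build_vm_pipeline is a hypothetical helper of my own, not part of pymongo, and I have only checked the shape of the pipeline it returns, not run it against a live server.

```python
def build_vm_pipeline(data_store):
    """Build a pipeline listing the distinct VM names that use data_store.

    Hypothetical helper: it only assembles the list of stages, so it can
    be inspected without a running MongoDB server.
    """
    return [
        # Flatten each nested array so every document holds exactly one
        # cluster, one host, one VM and one data store.
        {"$unwind": "$clusters"},
        {"$unwind": "$clusters.hosts"},
        {"$unwind": "$clusters.hosts.vms"},
        {"$unwind": "$clusters.hosts.vms.data_stores"},
        # After unwinding, the match applies to a single VM at a time.
        {"$match": {"clusters.hosts.vms.data_stores.name": data_store}},
        # Collapse duplicates: one result document per distinct VM name.
        {"$group": {"_id": "$clusters.hosts.vms.name"}},
    ]


pipeline = build_vm_pipeline("EMC_123")

# Usage with the collection from the script above (needs a live server):
# for doc in collection.aggregate(pipeline):
#     print(doc["_id"])
```

With the skeleton document, I would expect this to yield a single result whose _id is vm-name1, since the grouping happens after the per-VM match instead of on the whole data center document.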