How to filter data in mongo collection using pymongo

I am using pymongo to query MongoDB and check for duplicates in a particular collection. I have identified the duplicates, but I want to add one more filter to the script. Please find my script below:

from pymongo import MongoClient


client = MongoClient('localhost')
db = client.test

data = db.devices.aggregate([
    {'$group': {'_id':{'UserId':"$userId",'DeviceType':"$deviceType"},
                'count':{"$sum":1}}}, 
    {'$match': {'count' : {"$gt" : 1}}}
])

# Each result is a whole grouped document, not just an _id
for doc in data:
    print(doc)

With the above script, I want to check for duplicates only among the data where DeviceType = "email". I tried adding an "and" condition after the match, but it didn't work.

Could you please let me know how to achieve that?

Thanks

You can do this efficiently by prepending a $match stage to your pipeline to filter the docs, so that you're only grouping the docs where deviceType = "email":

data = db.devices.aggregate([
    {'$match': {'deviceType': 'email'}},
    {'$group': {'_id': {'UserId': "$userId", 'DeviceType': "$deviceType"},
                'count': {"$sum": 1}}}, 
    {'$match': {'count': {"$gt": 1}}}
])
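
As a rough sketch, the pipeline above could be wrapped in a small helper so that the device type becomes a parameter. The names `build_duplicate_pipeline` and `find_duplicates` are made up for illustration, and `collection` is assumed to be any object with an `aggregate` method, such as `db.devices`:

```python
def build_duplicate_pipeline(device_type):
    """Build a pipeline that finds duplicate (userId, deviceType) pairs."""
    return [
        # Filter first so that only matching documents are grouped.
        {'$match': {'deviceType': device_type}},
        # Count documents per (userId, deviceType) pair.
        {'$group': {'_id': {'UserId': '$userId', 'DeviceType': '$deviceType'},
                    'count': {'$sum': 1}}},
        # Keep only the pairs that occur more than once.
        {'$match': {'count': {'$gt': 1}}},
    ]


def find_duplicates(collection, device_type):
    """Run the pipeline on `collection` and return the duplicate groups."""
    # `collection` is assumed to expose aggregate(pipeline), as db.devices does.
    return list(collection.aggregate(build_duplicate_pipeline(device_type)))
```

With the question's setup, this would be called as `find_duplicates(db.devices, 'email')`.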

I think this is a near duplicate of "using $and with $match in mongodb".

As in that question, I believe you may simply have a syntax error in your query. You will want something like:

{'$match': {
    '$and': [
        {'count': {"$gt": 1}},
        # After $group, the device type is nested under _id
        {'_id.DeviceType': {"$eq": "email"}}
    ]
}}
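
One thing to keep in mind when matching after `$group`: the grouped documents contain only `_id` and `count`, so the device type is addressed as `_id.DeviceType` rather than as a top-level field. The plain-Python imitation below is only meant to illustrate that shape. The sample documents and the `group_by_user_and_type` helper are invented, and no MongoDB is involved:

```python
from collections import Counter


def group_by_user_and_type(docs):
    """Imitate the $group stage: one output doc per (userId, deviceType) pair."""
    counts = Counter((d['userId'], d['deviceType']) for d in docs)
    return [{'_id': {'UserId': user, 'DeviceType': dtype}, 'count': n}
            for (user, dtype), n in counts.items()]


# Invented sample data, for illustration only.
docs = [
    {'userId': 1, 'deviceType': 'email'},
    {'userId': 1, 'deviceType': 'email'},
    {'userId': 2, 'deviceType': 'sms'},
    {'userId': 2, 'deviceType': 'sms'},
]

grouped = group_by_user_and_type(docs)

# Plain-Python equivalent of matching count > 1 and _id.DeviceType == "email".
email_duplicates = [g for g in grouped
                    if g['count'] > 1 and g['_id']['DeviceType'] == 'email']
```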

If that doesn't help, then please paste what you have tried so far, as well as any error message output.
