

How to filter data in mongo collection using pymongo

I am using pymongo to query MongoDB and check for duplicates in a particular collection. I have identified the duplicates, but I want to add one more filter to the script. Please find my script below:

from pymongo import MongoClient

client = MongoClient('localhost')
db = client.test

# Group by (userId, deviceType) and keep the groups that occur more than once.
data = db.devices.aggregate([
    {'$group': {'_id': {'UserId': "$userId", 'DeviceType': "$deviceType"},
                'count': {"$sum": 1}}},
    {'$match': {'count': {"$gt": 1}}}
])

for _id in data:
    print(_id)

From the above script, I want to check duplicates only for the data where DeviceType = "email". I tried adding an "and" condition after the match, but it didn't work.

Could you please let me know how to achieve that?


You can do this efficiently by prepending a $match stage to your pipeline to filter the docs so that you're only grouping on the docs where deviceType = "email":

data = db.devices.aggregate([
    # Filter first, so only "email" docs reach the $group stage.
    {'$match': {'deviceType': 'email'}},
    {'$group': {'_id': {'UserId': "$userId", 'DeviceType': "$deviceType"},
                'count': {"$sum": 1}}},
    {'$match': {'count': {"$gt": 1}}}
])
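
To reuse the same idea for other device types, the pipeline can be built by a small function. This is only a sketch: `build_duplicate_pipeline` is a hypothetical helper name, not part of pymongo, and it assumes the field names `userId` and `deviceType` from the question.

```python
def build_duplicate_pipeline(device_type):
    """Return a pipeline that reports duplicate (userId, deviceType) pairs.

    Hypothetical helper for illustration; not a pymongo function.
    """
    return [
        # Filter first so that only the requested device type is grouped.
        {"$match": {"deviceType": device_type}},
        # Count documents per (userId, deviceType) pair.
        {"$group": {
            "_id": {"UserId": "$userId", "DeviceType": "$deviceType"},
            "count": {"$sum": 1},
        }},
        # Keep only the pairs that occur more than once.
        {"$match": {"count": {"$gt": 1}}},
    ]


pipeline = build_duplicate_pipeline("email")
```

With the `db` handle from the question, this would be run as `db.devices.aggregate(build_duplicate_pipeline("email"))`.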

I think this is a near duplicate of "using $and with $match in mongodb".

As in that question, I believe you may simply have a syntax error in your query. You will want something like:

{'$match': {
    '$and': [
        {'count': {"$gt": 1}},
        # After $group, DeviceType is nested inside _id.
        {'_id.DeviceType': {"$eq": "email"}}
    ]
}}
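
One detail worth checking when filtering after `$group`: the documents that reach the second `$match` look like `{'_id': {'UserId': ..., 'DeviceType': ...}, 'count': ...}`, so the device type is addressed as `_id.DeviceType` rather than as a top-level field. The plain-Python sketch below mirrors that group-then-filter logic in memory to make the shape visible. It does not talk to MongoDB, and both the sample documents and the name `grouped_duplicates` are made up for illustration.

```python
from collections import Counter

# Hypothetical sample documents shaped like those in the question's collection.
devices = [
    {"userId": 1, "deviceType": "email"},
    {"userId": 1, "deviceType": "email"},
    {"userId": 2, "deviceType": "sms"},
    {"userId": 2, "deviceType": "sms"},
    {"userId": 3, "deviceType": "email"},
]


def grouped_duplicates(docs, device_type):
    """In-memory mirror of $group followed by $match with an $and condition."""
    # Equivalent of $group: count documents per (userId, deviceType) pair.
    counts = Counter((d["userId"], d["deviceType"]) for d in docs)
    grouped = [
        {"_id": {"UserId": user_id, "DeviceType": dev_type}, "count": n}
        for (user_id, dev_type), n in counts.items()
    ]
    # Equivalent of $match with $and: count > 1 AND _id.DeviceType == device_type.
    return [
        g for g in grouped
        if g["count"] > 1 and g["_id"]["DeviceType"] == device_type
    ]


duplicates = grouped_duplicates(devices, "email")
print(duplicates)
```

Only user 1 has more than one "email" document in this sample, so that is the single group returned; the duplicated "sms" pair is dropped by the device-type condition.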

If that doesn't help, please paste what you have tried so far as well as any error message output.

