
Find duplicates with aggregation and lookup with MongoDB and Golang

I need to find duplicates with aggregation and lookup with MongoDB and Golang. Here is my Event structure.

// Event describes the model of an Event
type Event struct {
    ID            string      `bson:"_id" json:"_id" valid:"alphanum,printableascii"`
    OldID         string      `bson:"old_id" json:"old_id" valid:"alphanum,printableascii"`
    ParentID      string      `bson:"_parent_id" json:"_parent_id" valid:"alphanum,printableascii"`
    Name          string      `bson:"name" json:"name"`
    Content       string      `bson:"content" json:"content"`
    Slug          string      `bson:"slug" json:"slug"`
    LocationID    string      `bson:"_location_id" json:"_location_id"`
    Price         string      `bson:"price" json:"price"`
    CreatedBy     string      `bson:"created_by" json:"created_by"`
    CreatedAt     time.Time   `bson:"created_at" json:"created_at"`
    ModifiedAt    time.Time   `bson:"modified_at" json:"modified_at"`
}

Here is the request I already have:

// Create the pipeline
    pipeline := []bson.M{
        bson.M{
            "$group": bson.M{
                "_id": bson.M{
                    "_location_id": "$_location_id",
                    "start_date":   "$start_date",
                },
                "docs":  bson.M{"$push": "$_id"},
                "count": bson.M{"$sum": 1},
            },
        },
        bson.M{
            "$match": bson.M{
                "count": bson.M{"$gt": 1.0},
            },
        },
    }

    // Do the request
    dupes := []bson.M{}
    err := session.DB(shared.DatabaseNamespace).C(dao.collection).Pipe(pipeline).All(&dupes)

Two events must not share both the same start_date and the same _location_id. This is what I get:

/* 1 */
{
    "_id" : {
        "_location_id" : "4okPZllaoueYC3U2",
        "start_date" : ISODate("2018-04-22T18:00:00.000Z")
    },
    "count" : 2.0,
    "docs" : [ 
        "FFSC2sJcrWgj2FsU", 
        "lwHknTHFfVAzB8ui"
    ]
}

/* 2 */
{
    "_id" : {
        "_location_id" : "pC8rlLVao5c2CeBh",
        "start_date" : ISODate("2018-04-03T19:00:00.000Z")
    },
    "count" : 2.0,
    "docs" : [ 
        "jPRbkINiCExzh2tT", 
        "C8hx92QSZEl7HUIz"
    ]
}

Fine, it is working, but I would like to obtain, directly from Mongo, an array of my Event type and, if possible, an array of arrays of Event: [][]*Event. In other words, an array in which each element is one group of events that duplicate each other.

For example:

// Pipeline
...

// Do the request
events := [][]*Event{}
err := session.DB(shared.DatabaseNamespace).C(dao.collection).Pipe(pipeline).All(&events)

Or do I need to perform the logic in Golang to achieve what I need?

The libraries I use are:

"gopkg.in/mgo.v2"
"gopkg.in/mgo.v2/bson"

Note: there is no need to handle the _location_id; I look it up with a DTO inside my Golang logic.

EDIT: If I cannot look up the IDs, can I at least obtain the IDs as an array directly in the result? For example:

[ 
    "jPRbkINiCExzh2tT", 
    "C8hx92QSZEl7HUIz"
]

This is what I tried to add to the request: {$out: "uniqueIds"}. But it is not working.

Yes, you need to perform the logic in Golang to achieve what you need.

You can do it like this:

// DublesAgregate is one result of the pipeline: a group of duplicated events.
type DublesAgregate struct {
    Id    IdStruct `bson:"_id"`
    Docs  []string `bson:"docs"`            // event IDs; they are plain strings in the Event model
    Count int      `bson:"count,omitempty"` // numeric result of $sum
}

// IdStruct mirrors the compound _id built by the $group stage.
type IdStruct struct {
    Location_id string    `bson:"_location_id,omitempty"`
    Start_date  time.Time `bson:"start_date,omitempty"` // stored as a date, not a string
}

// Create the pipeline
    pipeline := []bson.M{
        bson.M{
            "$group": bson.M{
                "_id": bson.M{
                    "_location_id": "$_location_id",
                    "start_date":   "$start_date",
                },
                "docs":  bson.M{"$push": "$_id"},
                "count": bson.M{"$sum": 1},
            },
        },
        bson.M{
            "$match": bson.M{
                "count": bson.M{"$gt": 1.0},
            },
        },
    }

// Do the request
    dupes := []DublesAgregate{}
    err := session.DB(shared.DatabaseNamespace).C(dao.collection).Pipe(pipeline).All(&dupes)
    if err != nil {
        // handle the error before using dupes
    }

// Get Docs slice: flatten the IDs of every duplicate group into one slice
    result := []string{}
    for _, group := range dupes {
        result = append(result, group.Docs...)
    }
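
If the full events are wanted rather than only their IDs, one option that may be worth trying is to push `$$ROOT` instead of `$_id` in the `$group` stage, so that each group carries the whole documents. The following is only a sketch under assumptions, not part of the answer above: `DupeGroup`, `dupePipeline` and `groupsToEvents` are hypothetical names, `Event` is trimmed to three fields, and the stages are written as plain maps (bson.M is itself a `map[string]interface{}`) so the snippet runs without a database. The sample groups in `main` stand in for what `Pipe(...).All(&groups)` would presumably fill in.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is a trimmed copy of the question's model (assumption: the other
// fields decode the same way and are left out only for brevity).
type Event struct {
	ID         string `bson:"_id" json:"_id"`
	Name       string `bson:"name" json:"name"`
	LocationID string `bson:"_location_id" json:"_location_id"`
}

// DupeGroup is a hypothetical result type: one value per group of duplicates.
type DupeGroup struct {
	Docs  []*Event `bson:"docs" json:"docs"`
	Count int      `bson:"count" json:"count"`
}

// dupePipeline builds the same two stages as above, except that "docs"
// collects whole documents through $$ROOT instead of only their _id.
func dupePipeline() []map[string]interface{} {
	return []map[string]interface{}{
		{"$group": map[string]interface{}{
			"_id": map[string]interface{}{
				"_location_id": "$_location_id",
				"start_date":   "$start_date",
			},
			"docs":  map[string]interface{}{"$push": "$$ROOT"},
			"count": map[string]interface{}{"$sum": 1},
		}},
		{"$match": map[string]interface{}{
			"count": map[string]interface{}{"$gt": 1},
		}},
	}
}

// groupsToEvents reshapes the decoded groups into the [][]*Event form
// asked for in the question: one inner slice per group of duplicates.
func groupsToEvents(groups []DupeGroup) [][]*Event {
	out := make([][]*Event, 0, len(groups))
	for _, g := range groups {
		out = append(out, g.Docs)
	}
	return out
}

func main() {
	stages, err := json.Marshal(dupePipeline())
	if err != nil {
		panic(err)
	}
	fmt.Println(string(stages))

	// Stand-in data using the IDs from the sample output in the question.
	groups := []DupeGroup{
		{Docs: []*Event{{ID: "FFSC2sJcrWgj2FsU"}, {ID: "lwHknTHFfVAzB8ui"}}, Count: 2},
		{Docs: []*Event{{ID: "jPRbkINiCExzh2tT"}, {ID: "C8hx92QSZEl7HUIz"}}, Count: 2},
	}
	for i, evs := range groupsToEvents(groups) {
		for _, e := range evs {
			fmt.Println(i, e.ID)
		}
	}
}
```

With mgo, the idea would be to pass `dupePipeline()` to `Pipe` and decode with `All(&groups)` where `groups` is a `[]DupeGroup`. The aggregation still returns one document per group, so the `[][]*Event` shape is obtained by the small reshaping step in Go rather than directly from Mongo.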
