Find duplicates with aggregation and lookup with MongoDB and Golang
I need to find duplicates with aggregation and lookup with MongoDB and Golang. Here is my Event structure.
// Event describes the model of an Event
type Event struct {
	ID         string    `bson:"_id" json:"_id" valid:"alphanum,printableascii"`
	OldID      string    `bson:"old_id" json:"old_id" valid:"alphanum,printableascii"`
	ParentID   string    `bson:"_parent_id" json:"_parent_id" valid:"alphanum,printableascii"`
	Name       string    `bson:"name" json:"name"`
	Content    string    `bson:"content" json:"content"`
	Slug       string    `bson:"slug" json:"slug"`
	LocationID string    `bson:"_location_id" json:"_location_id"`
	Price      string    `bson:"price" json:"price"`
	CreatedBy  string    `bson:"created_by" json:"created_by"`
	CreatedAt  time.Time `bson:"created_at" json:"created_at"`
	ModifiedAt time.Time `bson:"modified_at" json:"modified_at"`
}
Here is the request I already have:
// Create the pipeline
pipeline := []bson.M{
	bson.M{
		"$group": bson.M{
			"_id": bson.M{
				"_location_id": "$_location_id",
				"start_date":   "$start_date",
			},
			"docs":  bson.M{"$push": "$_id"},
			"count": bson.M{"$sum": 1},
		},
	},
	bson.M{
		"$match": bson.M{
			"count": bson.M{"$gt": 1.0},
		},
	},
}
// Do the request
dupes := []bson.M{}
err := session.DB(shared.DatabaseNamespace).C(dao.collection).Pipe(pipeline).All(&dupes)
The events must not have the same start_date and the same _location_id. This is what I get:
/* 1 */
{
    "_id" : {
        "_location_id" : "4okPZllaoueYC3U2",
        "start_date" : ISODate("2018-04-22T18:00:00.000Z")
    },
    "count" : 2.0,
    "docs" : [
        "FFSC2sJcrWgj2FsU",
        "lwHknTHFfVAzB8ui"
    ]
}
/* 2 */
{
    "_id" : {
        "_location_id" : "pC8rlLVao5c2CeBh",
        "start_date" : ISODate("2018-04-03T19:00:00.000Z")
    },
    "count" : 2.0,
    "docs" : [
        "jPRbkINiCExzh2tT",
        "C8hx92QSZEl7HUIz"
    ]
}
Fine, it is working, but I would like to obtain, directly from Mongo, an array of my Event type and, if possible, an array of arrays of Event: [][]*Event. In other words, an array of the groups of duplicates.
For example:
// Pipeline
...
// Do the request
var events [][]*Event
err := session.DB(shared.DatabaseNamespace).C(dao.collection).Pipe(pipeline).All(&events)
Or do I need to perform the logic in Golang to achieve what I need?
The libraries I use are:
"gopkg.in/mgo.v2"
"gopkg.in/mgo.v2/bson"
Note: there is no need to worry about the _location_id; I look it up with a DTO inside my Golang logic.
EDIT: If I cannot look up the IDs, can I at least obtain the IDs as an array directly in the result? For example:
[
"jPRbkINiCExzh2tT",
"C8hx92QSZEl7HUIz"
]
This is what I tried to add to the request: {$out: "uniqueIds"}. But it is not working.
Yes, you need to perform the logic in Golang to achieve what you need.
You can do it like this:
// DublesAgregate maps one document returned by the pipeline.
type DublesAgregate struct {
	Id    IdStruct `bson:"_id"`
	Docs  []string `bson:"docs"`  // event _id values are plain strings, not ObjectIds
	Count int      `bson:"count"` // $sum produces a number
}

// IdStruct maps the compound group key (requires the "time" import).
type IdStruct struct {
	Location_id string    `bson:"_location_id,omitempty"`
	Start_date  time.Time `bson:"start_date,omitempty"` // stored as a date in the sample output
}

// Create the pipeline
pipeline := []bson.M{
	bson.M{
		"$group": bson.M{
			"_id": bson.M{
				"_location_id": "$_location_id",
				"start_date":   "$start_date",
			},
			"docs":  bson.M{"$push": "$_id"},
			"count": bson.M{"$sum": 1},
		},
	},
	bson.M{
		"$match": bson.M{
			"count": bson.M{"$gt": 1.0},
		},
	},
}
// Do the request
dupes := []DublesAgregate{}
err := session.DB(shared.DatabaseNamespace).C(dao.collection).Pipe(pipeline).All(&dupes)
if err != nil {
	// handle the error as appropriate for the caller
	return err
}
// Get Docs slice: flatten the IDs of every duplicate group
result := []string{}
for _, group := range dupes {
	result = append(result, group.Docs...)
}
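If the goal is the [][]*Event shape from the question, the Go-side logic could go one step further: load the events whose IDs appear in the groups (for instance with a single query using $in on _id), then regroup them. The sketch below covers only that regrouping step in plain Go, so it runs without a database. The trimmed-down Event, the DupeGroup type and the GroupEvents helper are hypothetical names introduced for illustration, not part of mgo or of the original code.

```go
package main

import "fmt"

// Event is a trimmed-down stand-in for the Event model above (assumption:
// only the ID matters for regrouping).
type Event struct {
	ID string
}

// DupeGroup mirrors the "docs" array of one aggregation result:
// the IDs of the events that share a location and a start date.
type DupeGroup struct {
	Docs []string
}

// GroupEvents is a hypothetical helper that turns groups of IDs into
// groups of events. IDs with no matching event are skipped.
func GroupEvents(groups []DupeGroup, events []*Event) [][]*Event {
	// Index the events by ID so that each lookup is constant time.
	byID := make(map[string]*Event, len(events))
	for _, e := range events {
		byID[e.ID] = e
	}

	result := make([][]*Event, 0, len(groups))
	for _, g := range groups {
		set := make([]*Event, 0, len(g.Docs))
		for _, id := range g.Docs {
			if e, ok := byID[id]; ok {
				set = append(set, e)
			}
		}
		result = append(result, set)
	}
	return result
}

func main() {
	// The IDs are taken from the sample output in the question.
	groups := []DupeGroup{
		{Docs: []string{"FFSC2sJcrWgj2FsU", "lwHknTHFfVAzB8ui"}},
		{Docs: []string{"jPRbkINiCExzh2tT", "C8hx92QSZEl7HUIz"}},
	}
	events := []*Event{
		{ID: "FFSC2sJcrWgj2FsU"},
		{ID: "lwHknTHFfVAzB8ui"},
		{ID: "jPRbkINiCExzh2tT"},
		{ID: "C8hx92QSZEl7HUIz"},
	}

	for i, set := range GroupEvents(groups, events) {
		fmt.Printf("group %d has %d events\n", i, len(set))
	}
}
```

Indexing the events in a map first keeps the regrouping linear in the number of IDs, and the order of each inner slice follows the order of the IDs in the corresponding group.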