
How to delete documents in MongoDB according to date

I want to group all documents that share the same Hash and, for every group containing more than one document, keep only the oldest record according to startDate.

My db structure is as follows:

[{
  "_id": "82bacef1915f4a75e6a18406",
  "Hash": "cdb3d507734383260b1d26bd3edcdfac",
  "duration": 12,
  "price": 999,
  "purchaseType": "Complementary",
  "startDate": {
    "$date": {
      "$numberLong": "1656409841000"
    }
  },
  "endDate": {
    "$date": {
      "$numberLong": "1687859441000"
    }
  }
}]

I was using this query, which I wrote myself:

db.Mydb.aggregate([
  {
    "$group": {
      _id: { hash: "$Hash" },
      dups: { $addToSet: "$_id" },
      count: { $sum: 1 }
    }
  },
  { "$sort": { startDate: -1 } },
  {
    "$match": {
      count: { "$gt": 1 }
    }
  }
]).forEach(function(doc) {
  doc.dups.shift();
  db.Mydb.deleteMany({
    _id: { $in: doc.dups }
  });
})

This gives a result like this:

{ _id: { hash: '1c01ef475d072f207c4485d0a6448334' },
  dups: 
   [ '6307501ca03c94389f09b782',
     '6307501ca03c94389f09b783',
     '62bacef1915f4a75e6a18l06' ],
  count: 3 }

The problem is that the _ids in the dups array come back in a different order every time I run this query, i.e. they are not sorted according to the startDate field. What can be done here? Any help is appreciated. Thanks!

After the $group stage, the startDate field is no longer present in the results, so you cannot sort on it. As stated in the comments, you should put the $sort stage first in the aggregation pipeline.

db.Mydb.aggregate([
  {
    // -1 sorts newest first; use 1 if the oldest document should come first
    "$sort": { startDate: -1 }
  },
  {
    "$group": {
      _id: { hash: "$Hash" },
      dups: { $addToSet: "$_id" },
      count: { $sum: 1 }
    }
  },
  {
    "$match": { count: { "$gt": 1 } }
  }
])

Got the solution. I was using $addToSet in the $group pipeline stage, which does not allow duplicate values. Instead, I used $push, which allows duplicate elements in the array.
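Putting the two answers together, the cleanup could be sketched roughly as below. This is a sketch under assumptions, not a verified migration script. The collection and field names (Mydb, Hash, startDate) come from the question, while the helper names buildDedupPipeline and idsToDelete are made up for illustration. It sorts ascending (startDate: 1) rather than descending, because the code drops the first id of each group from the delete list, and the goal is for that first id to be the oldest record. It also assumes that $push keeps the order in which documents reach $group, which is what the solution above relies on; MongoDB documents the order of $addToSet results as unspecified.

```javascript
// Hypothetical helper: builds the aggregation pipeline.
function buildDedupPipeline() {
  return [
    // Oldest first; _id is added only as a tie-breaker for equal dates.
    { $sort: { startDate: 1, _id: 1 } },
    {
      $group: {
        _id: { hash: "$Hash" },
        // Assumption: $push appends ids in the order documents arrive.
        dups: { $push: "$_id" },
        count: { $sum: 1 }
      }
    },
    // Keep only hashes that occur more than once.
    { $match: { count: { $gt: 1 } } }
  ];
}

// Hypothetical helper: every id except the first (oldest) of each group.
function idsToDelete(groups) {
  return groups.flatMap((g) => g.dups.slice(1));
}

// Possible usage in mongosh (commented out so the block also runs in plain Node):
// const groups = db.Mydb.aggregate(buildDedupPipeline()).toArray();
// db.Mydb.deleteMany({ _id: { $in: idsToDelete(groups) } });
```

Before running the deleteMany call against real data, it may be worth printing idsToDelete(groups) and checking a few groups by hand.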
