MongoDB array query performance

Question

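I'm trying to figure out the best schema for a dating-site-like app. Users have a listing (possibly many), and they can view other users' listings to "like" and "dislike" them.

Currently I just store the other person's listing ID in likedBy and dislikedBy arrays. When a user "likes" a listing, their own listing ID is added to the liked listing's likedBy array. However, I would now like to track the timestamp at which a user liked a listing. This would be used for a user's "history list" or for data analysis.

I need to run two separate queries: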

find all active listings that this user has not liked or disliked before

and also a history of a user's "like"/"dislike" choices

find all the listings user X has liked in chronological order

My current schema is:

listings
  _id: 'sdf3f'
  likedBy: ['12ac', 'as3vd', 'sadf3']
  dislikedBy: ['asdf', 'sdsdf', 'asdfas']
  active: bool

Could I do something like this instead?

listings
  _id: 'sdf3f'
  likedBy: [{'12ac', date: Date}, {'ds3d', date: Date}]
  dislikedBy: [{'s12ac', date: Date}, {'6fs3d', date: Date}]
  active: bool

I was also considering creating a new collection for choices.

choices
  Id
  userId          // id of current user making the choice
  userlistId      // listing of the user making the choice
  listingChoseId  // the listing they chose yes/no
  type
  date

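I'm just not sure about the performance implications of having these choices in another collection when running the "find all active listings that this user has not liked or disliked before" query.

Any insight would be greatly appreciated!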

Answer 1

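You evidently decided it was a good idea to embed these in the "listings" documents so that your other usage patterns, beyond the cases presented here, work properly. With that in mind, there is no reason to throw that away.

To clarify, the structure you seem to want is something like this: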

{
    "_id": "sdf3f",
    "likedBy": [
         { "userId": "12ac",  "date": ISODate("2014-04-09T07:30:47.091Z") },
         { "userId": "as3vd", "date": ISODate("2014-04-09T07:30:47.091Z") },
         { "userId": "sadf3", "date": ISODate("2014-04-09T07:30:47.091Z") }
    ],
    "dislikedBy": [
        { "userId": "asdf",   "date": ISODate("2014-04-09T07:30:47.091Z") },
        { "userId": "sdsdf",  "date": ISODate("2014-04-09T07:30:47.091Z") },
        { "userId": "asdfas", "date": ISODate("2014-04-09T07:30:47.091Z") }
    ],
    "active": true
}

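That is all well and good, except for one catch. Because this content lives in two array fields, you cannot create a single index over both of them. The restriction is that only one array-typed (multikey) field can be included in a compound index.

So, to solve the obvious problem of your first query not being able to use an index, you would structure the document like this instead: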

{
    "_id": "sdf3f",
    "votes": [
        { 
            "userId": "12ac",
            "type": "like", 
            "date": ISODate("2014-04-09T07:30:47.091Z")
        },
        {
            "userId": "as3vd",
            "type": "like",
            "date": ISODate("2014-04-09T07:30:47.091Z")
        },
        { 
            "userId": "sadf3", 
            "type": "like", 
            "date": ISODate("2014-04-09T07:30:47.091Z")
        },
        { 
            "userId": "asdf", 
            "type": "dislike",
            "date": ISODate("2014-04-09T07:30:47.091Z")
        },
        {
            "userId": "sdsdf",
            "type": "dislike", 
            "date": ISODate("2014-04-09T07:30:47.091Z")
        },
        { 
            "userId": "asdfas", 
            "type": "dislike",
            "date": ISODate("2014-04-09T07:30:47.091Z")
        }
    ],
    "active": true
}

This allows an index of this form:

db.post.createIndex({
    "active": 1,
    "votes.userId": 1, 
    "votes.date": 1, 
    "votes.type": 1 
})

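In practice you will probably want a few indexes to suit your usage patterns, but the point is that you can now have indexes you can actually use.

Covering the first case, you have this form of query: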

db.post.find({ "active": true, "votes.userId": { "$ne": "12ac" } })

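That makes sense, considering that a user is not going to have both a like and a dislike entry on the same listing. Given the order of that index, at least active can be used to filter, because the negated condition needs to scan everything else. No structure gets around that.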
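
To make the $ne semantics concrete, here is a minimal plain-JavaScript sketch that mimics the filter in memory. The helper notYetVotedBy and the sample documents are hypothetical and exist only for illustration. This is not how MongoDB evaluates the query internally; it only shows which documents end up in the result: a listing matches when it is active and no element of votes carries the given userId.

```javascript
// Hypothetical sample data in the "votes" shape described above.
const listings = [
  { _id: "a1", active: true,
    votes: [{ userId: "12ac", type: "like", date: new Date("2014-04-09T07:30:47.091Z") }] },
  { _id: "b2", active: true,
    votes: [{ userId: "asdf", type: "dislike", date: new Date("2014-04-09T07:30:47.091Z") }] },
  { _id: "c3", active: false, votes: [] },
  { _id: "d4", active: true, votes: [] }
];

// In-memory equivalent of the filter
//   { "active": true, "votes.userId": { "$ne": userId } }
// $ne on an array field rejects a document if ANY element matches.
function notYetVotedBy(docs, userId) {
  return docs.filter(
    (doc) => doc.active === true && !doc.votes.some((v) => v.userId === userId)
  );
}

const unseenIds = notYetVotedBy(listings, "12ac").map((doc) => doc._id);
console.log(unseenIds); // ids of active listings that user "12ac" has not voted on
```

Note that a listing with an empty votes array also matches, which is what you want for listings nobody has voted on yet.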

For the other case, you probably want an index where userId comes before the date and is the first element. Then your query is quite simple:

db.post.find({ "votes.userId": "12ac" })
    .sort({ "votes.userId": 1, "votes.date": 1 })
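
One caveat, offered as my reading rather than something the answer states: sorting on votes.date orders whole documents by an array field, so the order is not guaranteed to follow the date of this particular user's vote, and the filter does not restrict results to "like" entries. A minimal sketch of one possible alternative is to unwind the array and filter on both userId and type. The pipeline uses the standard $match, $unwind, $sort and $project stages; the helper likedHistory and the sample data are hypothetical and only mirror the pipeline in memory.

```javascript
// Pipeline that could be passed to db.post.aggregate(...). The user id is an example value.
const historyUser = "12ac";
const historyPipeline = [
  { $match: { "votes.userId": historyUser } },           // narrow documents first
  { $unwind: "$votes" },                                 // one document per vote
  { $match: { "votes.userId": historyUser, "votes.type": "like" } },
  { $sort: { "votes.date": 1 } },                        // chronological by this user's vote
  { $project: { _id: 1, date: "$votes.date" } }
];

// Hypothetical in-memory mirror of the pipeline above.
function likedHistory(docs, user) {
  return docs
    .flatMap((doc) => doc.votes.map((v) => ({ listingId: doc._id, ...v })))
    .filter((v) => v.userId === user && v.type === "like")
    .sort((a, b) => a.date - b.date)
    .map((v) => ({ _id: v.listingId, date: v.date }));
}

// Hypothetical sample data: "a1" has an older vote from a different user.
const historyDocs = [
  { _id: "a1", votes: [
    { userId: "zz", type: "like", date: new Date("2014-04-01T00:00:00Z") },
    { userId: "12ac", type: "like", date: new Date("2014-04-09T00:00:00Z") }
  ] },
  { _id: "b2", votes: [
    { userId: "12ac", type: "like", date: new Date("2014-04-05T00:00:00Z") }
  ] },
  { _id: "c3", votes: [
    { userId: "12ac", type: "dislike", date: new Date("2014-04-02T00:00:00Z") }
  ] }
];

const history = likedHistory(historyDocs, historyUser);
console.log(history.map((h) => h._id)); // listings liked by "12ac", oldest like first
```

In the sample, "b2" comes before "a1" because the user liked it first, even though "a1" contains an older vote from someone else.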

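You may be wondering whether you have suddenly lost something, since getting the number of "likes" and "dislikes" used to be as easy as testing the size of the array, and now it is a little different. That is nothing aggregate cannot solve: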

db.post.aggregate([
    { "$unwind": "$votes" },
    { "$group": {
       "_id": {
           "_id": "$_id",
           "active": "$active"
       },
       "likes": { "$sum": { "$cond": [
           { "$eq": [ "$votes.type", "like" ] },
           1,
           0
       ]}},
       "dislikes": { "$sum": { "$cond": [
           { "$eq": [ "$votes.type", "dislike" ] },
           1,
           0
       ]}}
    }}
])
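
To show the shape of the result this pipeline produces for one listing, here is a plain-JavaScript sketch that mirrors the $group stage in memory. The helper countVotes and the sample document are hypothetical. One difference worth knowing: $unwind drops documents whose votes array is empty, while this helper would return zero counts for them.

```javascript
// Hypothetical in-memory mirror of the $unwind + $group stages for a single document.
function countVotes(doc) {
  const counts = { likes: 0, dislikes: 0 };
  for (const vote of doc.votes) {
    if (vote.type === "like") counts.likes += 1;       // same test as the first $cond
    if (vote.type === "dislike") counts.dislikes += 1; // same test as the second $cond
  }
  // The grouping _id carries the document fields you want to keep.
  return { _id: { _id: doc._id, active: doc.active }, ...counts };
}

// Sample document using the user ids from the example above (dates omitted).
const countedDoc = {
  _id: "sdf3f",
  active: true,
  votes: [
    { userId: "12ac", type: "like" },
    { userId: "as3vd", type: "like" },
    { userId: "sadf3", type: "like" },
    { userId: "asdf", type: "dislike" },
    { userId: "sdsdf", type: "dislike" },
    { userId: "asdfas", type: "dislike" }
  ]
};

const voteCounts = countVotes(countedDoc);
console.log(voteCounts);
```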

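So whatever your actual usage looks like, you can keep any important parts of the document in the grouping _id and then evaluate the number of "likes" and "dislikes" in a simple way.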

Note also that changing an entry from like to dislike can be done in a single atomic update.
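
A minimal sketch of what such an update could look like, assuming the votes layout above. The filter matches the array element by userId and the positional $ operator in the $set targets that same element, so the change happens in one update on one document. The ids are example values, and the helper switchVote is hypothetical; it only mirrors the effect in memory.

```javascript
// Filter and update documents that could be passed to
// db.post.update(filter, update). Values are examples only.
const voteFilter = { _id: "sdf3f", "votes.userId": "12ac" };
const voteUpdate = {
  $set: { "votes.$.type": "dislike", "votes.$.date": new Date() }
};

// Hypothetical in-memory mirror: change the first element matching the user.
function switchVote(doc, userId, newType, when) {
  const vote = doc.votes.find((v) => v.userId === userId);
  if (!vote) return false; // no matching element: nothing is modified
  vote.type = newType;
  vote.date = when;
  return true;
}

const switchDoc = {
  _id: "sdf3f",
  active: true,
  votes: [
    { userId: "12ac", type: "like", date: new Date("2014-04-09T07:30:47.091Z") },
    { userId: "asdf", type: "dislike", date: new Date("2014-04-09T07:30:47.091Z") }
  ]
};

const changed = switchVote(switchDoc, "12ac", "dislike", new Date("2014-04-10T00:00:00Z"));
console.log(changed, switchDoc.votes[0].type);
```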

There is much more you can do, but I would prefer this structure for the reasons given.

Solution 1 (16, accepted, 2014-04-09 08:19:59)
