MongoDB: get count of array items over all collection items
I have a MongoDB collection like this:
{
_id: 123,
name: 'some name',
category: 17,
sizes: ['XS', 'S', 'XL']
},
{
_id: 124,
name: 'another name',
category: 17,
sizes: ['S', 'L', '2XL']
}
I need two different things. First: how many items of each size are there in a given category?
{
17: {
XS: 0,
S: 19,
M: 100
},
39: {
XS: 5,
...
}
}
A solution that only shows whether any item is available in a given size would also be fine:
{
17: {
XS: false,
S: true,
M: true,
...
},
39: {
XS: true,
...
}
}
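Second: I need the same thing, but after performing a full-text search on the name.
I tried aggregating the fields, but I am a bit lost on how to perform operations on arrays.
Any help is appreciated.
Update:
With the help of this answer, I got one step closer: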
db.so.aggregate(
[
// First, filter by name or something else
// this could also include the category
{
$match: {
'name': {
$regex: /other.*/i
}
}
},
// explode the sizes-array into single documents
{ '$unwind': '$sizes' },
// group and count
{ '$group': {
'_id': '$sizes',
'count': { '$sum': 1 }
}}
]
)
What is still missing: doing this per category.
Here is some sample data I inserted:
/* 1 */
{
"_id" : 123,
"name" : "some name",
"category" : 17,
"sizes" : [
"XS",
"S",
"XL"
]
}
/* 2 */
{
"_id" : 124,
"name" : "another name",
"category" : 17,
"sizes" : [
"S",
"L",
"2XL"
]
}
/* 3 */
{
"_id" : 125,
"name" : "name",
"category" : 35,
"sizes" : [
"S",
"L",
"2XL"
]
}
What you seem to want in your first use case is to group by size and by category. You can actually group by multiple keys. Here is an example:
db.so.aggregate([
// add your match here...
{
'$unwind': '$sizes' // flatten your array
},
// group and count
{
'$group': {
'_id': {
sizes: '$sizes',
category: '$category'
}, // group by both sizes and category
'count': {
'$sum': 1
},
}
},
{
'$group': {
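'_id': '$_id.category', // group by category now (after the first $group it lives under _id)
sizeCount: { // create an array that includes the size and the count for that size
$push: {
size: "$_id.sizes", // the size is also part of the previous stage's _id
count: "$count"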
}
}
}
}
])
This pipeline produces the following result:
{
"_id" : 17,
"sizeCount" : [
{
"size" : "2XL",
"count" : 1.0
},
{
"size" : "XS",
"count" : 1.0
},
{
"size" : "S",
"count" : 2.0
},
{
"size" : "XL",
"count" : 1.0
},
{
"size" : "L",
"count" : 1.0
}
]
}
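Would that be acceptable to you?
Now, regarding your second use case: how would you group by a size that does not even exist for that category? In general, though, you can use $cond to manipulate the result.
So, for the same example, if you apply the following pipeline: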
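If it helps to see what the stages do, here is a rough plain JavaScript sketch that emulates the $unwind and the two $group stages in memory, with no MongoDB involved. The function name countSizesByCategory is made up for illustration, and the documents are the sample data from the question.

```javascript
// Sample documents copied from the question.
const docs = [
  { _id: 123, name: "some name", category: 17, sizes: ["XS", "S", "XL"] },
  { _id: 124, name: "another name", category: 17, sizes: ["S", "L", "2XL"] },
  { _id: 125, name: "name", category: 35, sizes: ["S", "L", "2XL"] },
];

// Hypothetical helper (not a MongoDB API): mimics $unwind + $group + $group.
function countSizesByCategory(documents) {
  // category -> (size -> count)
  const counts = new Map();
  for (const doc of documents) {
    // $unwind: treat every array element as its own document
    for (const size of doc.sizes) {
      if (!counts.has(doc.category)) counts.set(doc.category, new Map());
      const perSize = counts.get(doc.category);
      // first $group: one counter per (category, size) pair
      perSize.set(size, (perSize.get(size) || 0) + 1);
    }
  }
  // second $group: one result per category, pushing { size, count } entries
  return Array.from(counts, ([category, perSize]) => ({
    _id: category,
    sizeCount: Array.from(perSize, ([size, count]) => ({ size, count })),
  }));
}

console.log(JSON.stringify(countSizesByCategory(docs), null, 2));
```

The order of the entries here follows insertion order, whereas MongoDB does not guarantee any particular order for $group output.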
db.so.aggregate([
// add your match here ...
{
'$unwind': '$sizes' // flatten your array
},
// group and count
{
'$group': {
'_id': {
sizes: '$sizes',
category: '$category'
}, // group by both sizes and category
'count': {
'$sum': 1
},
}
},
{
'$project': {
_id: 0,
'count': {
$cond: [{
$eq: ["$count", 1.0]
}, "Limited", "Many"]
},
category: "$_id.category",
sizes: "$_id.sizes"
}
},
{
'$group': {
'_id': '$category',
sizeCount: {
$push: {
size: "$sizes",
count: "$count"
}
}
}
}
])
It produces the following result (one example):
{
"_id" : 17,
"sizeCount" : [
{
"size" : "2XL",
"count" : "Limited"
},
{
"size" : "XS",
"count" : "Limited"
},
{
"size" : "S",
"count" : "Many"
},
{
"size" : "XL",
"count" : "Limited"
},
{
"size" : "L",
"count" : "Limited"
}
]
}
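So basically, with the line $cond: [{$eq: ["$count", 1.0]}, "Limited", "Many"] we are saying: if the count field is exactly 1.0, shirts of that size are limited, otherwise we have many. You can apply any comparison operator, so you could also do the following: $cond: [{$lte: ["$count", 2.0]}, "Limited", "Many"]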
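As for sizes that do not occur in a category at all, the pipeline cannot emit them because there is nothing to group. One possible workaround is to fill them in on the client. The sketch below assumes the application knows the complete list of sizes (ALL_SIZES is an assumption, as is the helper name toAvailability) and turns the aggregation output into the true/false document from the question.

```javascript
// Aggregation output in the shape produced by the first pipeline (category 17).
const aggregated = [
  {
    _id: 17,
    sizeCount: [
      { size: "2XL", count: 1 },
      { size: "XS", count: 1 },
      { size: "S", count: 2 },
      { size: "XL", count: 1 },
      { size: "L", count: 1 },
    ],
  },
];

// Assumption: the full list of sizes is known to the application.
const ALL_SIZES = ["XS", "S", "M", "L", "XL", "2XL"];

// Hypothetical helper: builds { category: { size: true|false } }.
function toAvailability(results, allSizes) {
  const out = {};
  for (const { _id, sizeCount } of results) {
    // the pipeline only emits sizes that occur at least once,
    // so mere presence in sizeCount means "available"
    const present = new Set(sizeCount.map((entry) => entry.size));
    out[_id] = {};
    for (const size of allSizes) {
      out[_id][size] = present.has(size);
    }
  }
  return out;
}

console.log(toAvailability(aggregated, ALL_SIZES));
```

Because it only checks for presence, the same helper also works on the "Limited"/"Many" output.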
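Note: the projection will be added soon.
You can
unwind
-> group on category and size
-> group on category and push
-> project
Please refer to the query below. It gives the result without any projection. I will add the projection soon to match your requirements.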
var group_by_category_and_sizes = {
"$group": {
"_id": {
"category": "$category",
"size": "$sizes"
},
"count": {
"$sum": 1
}
}
}
var group_by_category_and_push = {
"$group": {
"_id": {
"category": "$_id.category"
},
"combine": {
"$push": { "size": "$_id.size", "count": "$count" }
}
}
}
db.clothings.aggregate([{ "$unwind": "$sizes" }, group_by_category_and_sizes, group_by_category_and_push])
For the documents
{ name: 'some name', category: 17, sizes: ['XS', 'S', 'XL'] }
{ name: 'another name', category: 17, sizes: ['S', 'L', '2XL'] }
{ name: 'another name', category: 18, sizes: ['M', 'S', 'L'] }
this produces
{
"_id": {
"category": 18
},
"combine": [{
"size": "L",
"count": 1
}, {
"size": "S",
"count": 1
}, {
"size": "M",
"count": 1
}]
}
{
"_id": {
"category": 17
},
"combine": [{
"size": "2XL",
"count": 1
}, {
"size": "S",
"count": 2
}, {
"size": "XL",
"count": 1
}, {
"size": "L",
"count": 1
}, {
"size": "XS",
"count": 1
}]
}
Here is how you can get almost exactly the output document you proposed:
db.so.aggregate({
$unwind: "$sizes" // flatten the sizes array
}, {
$group: {
_id: { // group by both category and sizes
category: "$category",
size: "$sizes"
},
count: {
$sum: 1 // count number of documents per bucket
}
}
}, {
$group: {
_id: "$_id.category", // second grouping to get entries per category
sizes: {
$push: { k: "$_id.size", v: "$count" } // create an array of key/value pairs which we will need in this exact shape in the next stage
}
}
}, {
$project: {
"magic": {
$arrayToObject: // transform the key/value pair we generate below into a document
[[{
// the $substr is a hack to transform the numerical category (e.g. 17)
// into a string (not nice, probably not supported but working for now...)
// which is needed for the above $arrayToObject to work
k: { $substr: [ "$_id", 0, -1 ] },
v: {
$arrayToObject: "$sizes" // turn the key/value pairs we created in the previous pipeline stage into a document
}
}]]
}
}
}, {
$replaceRoot: {
newRoot: "$magic" // promote our "magic" field to the document root
}
})
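Please note that, although this gives you the right output, I would not necessarily suggest going down this route, since the aggregation pipeline is pretty heavy and has quite some "magic" built in. So if you can live with the output structure suggested by @Alex P., that will certainly be easier to understand and maintain, and faster.
Regarding the second case: you can add any number of preliminary $match stages before the $unwind stage to filter out all superfluous data.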
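Regarding the $substr hack: on MongoDB 4.0 or later, the $toString operator should be a cleaner way to turn the numeric category into a string key. A sketch of the $project stage with that one change, assuming such a server version:

```javascript
// Sketch only: same $project stage as above, with $toString (MongoDB 4.0+)
// replacing the $substr workaround.
const projectStage = {
  $project: {
    magic: {
      $arrayToObject: [[{
        k: { $toString: "$_id" },        // e.g. 17 becomes "17"
        v: { $arrayToObject: "$sizes" }, // [{ k, v }, ...] becomes { XS: 1, ... }
      }]],
    },
  },
};

console.log(JSON.stringify(projectStage, null, 2));
```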
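For the full-text search part, a minimal sketch could look like the following. It assumes a text index exists on the name field (for instance created with db.so.createIndex({ name: "text" })), and buildPipeline is just a made-up helper name. Note that a $match using $text has to be the first stage of the pipeline.

```javascript
// Hypothetical helper: returns a pipeline that filters first, then counts.
// Assumption: a text index exists on `name`.
function buildPipeline(searchTerm) {
  return [
    // $text is only allowed in a $match that is the first pipeline stage
    { $match: { $text: { $search: searchTerm } } },
    // flatten the sizes array
    { $unwind: "$sizes" },
    // count documents per (category, size) pair
    {
      $group: {
        _id: { category: "$category", size: "$sizes" },
        count: { $sum: 1 },
      },
    },
    // collect the per-size counts for each category
    {
      $group: {
        _id: "$_id.category",
        sizes: { $push: { size: "$_id.size", count: "$count" } },
      },
    },
  ];
}

// In the mongo shell this would be used as: db.so.aggregate(buildPipeline("another"))
console.log(JSON.stringify(buildPipeline("another"), null, 2));
```

If you would rather keep the regular expression from the question, the first stage can simply be swapped for that $match instead.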