I have a data structure like this:
myStructure = {
1 : ['ab','bc','cd','gh'] ,
2 : ['bc','cd','de'] ,
3 : ['cd','de','ef12','xz','ygd']
}
I want to find the elements that are present in every array inside myStructure, which in this case is 'cd'.
I'm going to load a lot of data into MongoDB and I want to find patterns/duplicates like the example above.
Is there any way to do this with MongoDB? Are there better ways to do this without MongoDB?
Update 1:
I noticed that my data structure is not ideal. I don't want to be limited to a fixed set of keys like 1, 2, 3, so I changed the structure to this:
myStructure = [
{key: 1, value: ['ab','bc','cd']} ,
{key: 2, value: ['bc','cd','de']} ,
{key: 3, value: ['cd','de','ef']},
...
]
Thanks for the answers so far, but I'd be grateful if you could answer the question according to the new structure.
What you need is an aggregation using the $setIntersection operator:
db.test.aggregate(
    [
        { $project: { "commonElement": { $setIntersection: [ "$1", "$2", "$3" ] } } }
    ]
)
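Since the question also asks about doing this without MongoDB, here is a minimal sketch of the same set intersection in plain JavaScript. The helper name intersectAll is hypothetical (it is not part of any library), and the sketch assumes the whole structure fits in memory. It works for any number of keys, not just 1, 2 and 3.

```javascript
// Hypothetical helper (not a MongoDB API): intersect any number of arrays.
function intersectAll(arrays) {
  if (arrays.length === 0) return [];
  // Start from the deduplicated first array and keep only the values
  // that every remaining array also contains.
  return arrays.slice(1).reduce(
    (common, current) => {
      const lookup = new Set(current);
      return common.filter((value) => lookup.has(value));
    },
    [...new Set(arrays[0])]
  );
}

// Sample data copied from the question.
const myStructure = {
  1: ["ab", "bc", "cd", "gh"],
  2: ["bc", "cd", "de"],
  3: ["cd", "de", "ef12", "xz", "ygd"],
};

const common = intersectAll(Object.values(myStructure));
console.log(common); // [ 'cd' ]
```

Building a Set for each array keeps every membership check constant time, so the cost grows roughly with the total number of elements.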
If you meant that all of the arrays are always present in each document, then you can do this using $setIntersection and $redact:
db.collection.aggregate([
    { "$redact": {
        "$cond": {
            "if": {
                "$gt": [
                    { "$size": { "$setIntersection": ["$1", "$2", "$3"] } },
                    0
                ]
            },
            "then": "$$KEEP",
            "else": "$$PRUNE"
        }
    }},
    { "$project": {
        "intersection": { "$setIntersection": ["$1", "$2", "$3"] }
    }}
])
The $redact stage first filters out any document whose arrays do not intersect, and the $project stage then shows the intersection.
So with all arrays in the same document:
{
"_id" : ObjectId("559a22f8369e4e157fe17338"),
"1" : [ "ab", "bc", "cd" ],
"2" : [ "bc", "cd", "de" ],
"3" : [ "cd", "de", "ef" ]
}
{
"_id" : ObjectId("559a2ebc369e4e157fe17339"),
"1" : [ "bc", "ab" ],
"2" : [ "de", "ef" ],
"3" : [ "aj", "kl" ]
}
You get:
{
"_id" : ObjectId("559a22f8369e4e157fe17338"),
"intersection" : [ "cd" ]
}
With individual documents like:
{ "key": 1, "value": ['ab','bc','cd']} ,
{ "key": 2, "value": ['bc','cd','de']},
{ "key": 3, "value": ['cd','de','ef']}
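Then unwind the arrays and group by value. Note that the final $match keeps every value that occurs in more than one document; to keep only the values present in every document, compare count with the total number of documents instead (assuming no value repeats within a single array):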
db.collection.aggregate([
    { "$unwind": "$value" },
    { "$group": {
        "_id": "$value",
        "keys": { "$push": "$key" },
        "count": { "$sum": 1 }
    }},
    { "$match": { "count": { "$gt": 1 } } }
])
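To see what this pipeline computes, the following is a rough in-memory sketch in plain JavaScript of the same unwind, group and match steps. The helper name groupKeysByValue is hypothetical, and the data is the three sample documents from above. It also shows the stricter filter (present in every document), which is what the question originally asked for.

```javascript
// Sample documents, same shape as in the updated question.
const docs = [
  { key: 1, value: ["ab", "bc", "cd"] },
  { key: 2, value: ["bc", "cd", "de"] },
  { key: 3, value: ["cd", "de", "ef"] },
];

// Hypothetical helper mirroring $unwind + $group:
// map each value to the list of keys whose array contains it.
function groupKeysByValue(documents) {
  const groups = new Map();
  for (const { key, value } of documents) {
    // The Set guards against a value repeated inside one array.
    for (const item of new Set(value)) {
      if (!groups.has(item)) groups.set(item, []);
      groups.get(item).push(key);
    }
  }
  return groups;
}

const groups = groupKeysByValue(docs);

// Equivalent of { $match: { count: { $gt: 1 } } }: values shared by at least two documents.
const shared = [...groups]
  .filter(([, keys]) => keys.length > 1)
  .map(([value, keys]) => ({ value, keys }));

// Stricter variant: values present in every document.
const inAll = [...groups]
  .filter(([, keys]) => keys.length === docs.length)
  .map(([value]) => value);

console.log(shared);
console.log(inAll);
```

With the sample data, shared contains 'bc', 'cd' and 'de' together with their keys, while inAll contains only 'cd'.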
To get the intersection of arrays within arrays in a single document:
{
"id": 1,
"someKey": "abc",
"items": [
{ "key": 1, "value": ['ab','bc','cd']} ,
{ "key": 2, "value": ['bc','cd','de']},
{ "key": 3, "value": ['cd','de','ef']}
]
}
Then $unwind multiple times and process:
db.collection.aggregate([
    { "$unwind": "$items" },
    { "$unwind": "$items.value" },
    { "$group": {
        "_id": {
            "_id": "$_id",
            "value": "$items.value"
        },
        "keys": { "$push": "$items.key" },
        "count": { "$sum": 1 }
    }},
    { "$match": { "count": { "$gt": 1 } } }
])
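The grouping key here combines the document _id with the value, so the counts are kept separate for each document. As a rough illustration of that per-document scoping, here is a plain JavaScript sketch; the helper name commonPerDocument and the second sample document are assumptions made for the example. It keeps only the values found in every entry of a document's items array.

```javascript
// Sample data: the first document is from the answer, the second is made up for contrast.
const documents = [
  {
    id: 1,
    someKey: "abc",
    items: [
      { key: 1, value: ["ab", "bc", "cd"] },
      { key: 2, value: ["bc", "cd", "de"] },
      { key: 3, value: ["cd", "de", "ef"] },
    ],
  },
  {
    id: 2,
    someKey: "xyz",
    items: [
      { key: 1, value: ["bc", "ab"] },
      { key: 2, value: ["de", "ef"] },
    ],
  },
];

// Hypothetical helper: for each document, count in how many inner arrays
// each value occurs and keep the values that occur in all of them.
function commonPerDocument(docs) {
  return docs.map(({ id, items }) => {
    const counts = new Map();
    for (const { value } of items) {
      // Count each value at most once per inner array.
      for (const v of new Set(value)) {
        counts.set(v, (counts.get(v) || 0) + 1);
      }
    }
    const common = [...counts]
      .filter(([, n]) => n === items.length)
      .map(([v]) => v);
    return { id, common };
  });
}

const perDocument = commonPerDocument(documents);
console.log(perDocument);
```

For the sample data this gives 'cd' for the first document and an empty list for the second, because the counts never mix values from different documents.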