An old slash-escaping bug left us with some messed-up data, like so:
{
  suggestions: [
    "ok",
    "not ok /////////// ... 10s of KBs of this ... //////",
  ]
}
I would like to just pull those bad values out of the array. My first idea was to $pull based on a regex that matches 4 "/" characters, but it appears that regexes do not work on large strings:
db.notes.count({suggestions: /\/\/\/\//}) // returns 0
db.notes.count({suggestions: {$regex: "////"}}) // returns 0
My next idea was to use a $where query to find documents that have suggestion strings longer than 1000 characters. That query works:
db.notes.count({
  suggestions: {$exists: true},
  $where: function() {
    return !!this.suggestions.filter(function (item) {
      return (item || "").length > 1000;
    }).length;
  }
})
// returns a plausible number
But a $where query can't be used as the condition in a $pull update:
db.notes.update({
  suggestions: {$exists: true},
}, {
  $pull: {
    suggestions: {
      $where: function() {
        return !!this.suggestions.filter(function (item) {
          return (item || "").length > 1000;
        }).length;
      }
    }
  }
})
That update fails with:
WriteResult({
  "nMatched" : 0,
  "nUpserted" : 0,
  "nModified" : 0,
  "writeError" : {
    "code" : 81,
    "errmsg" : "no context for parsing $where"
  }
})
I'm running out of ideas. Will I have to iterate over the entire collection and $set: {suggestions: suggestions.filter(...)} for each document individually? Is there no better way to clean bad values out of an array of large strings in MongoDB?
(I'm only adding the "javascript" tag to get SO to format the code correctly)
The simple solution pointed out in the question comments should have worked. It does work with a test case that recreates the original problem. Regexes can match on large strings; there is no special restriction there.
db.notes.updateMany({suggestions: /\/\//}, {$pull: {suggestions: /\/\//}})
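One thing to watch with a pattern like the one above: a two-slash regex also matches ordinary URLs, while the question asked about runs of four slashes. The sketch below is plain JavaScript, not MongoDB; pullMatching is a hypothetical helper that only mimics "remove every element matching the pattern", and the sample values are made up for illustration. It suggests that string size is not an obstacle for a regex and that the narrower pattern keeps URL-like values.

```javascript
// Plain-JavaScript illustration only. MongoDB evaluates $pull on the server
// with its own regex engine; pullMatching is a hypothetical stand-in.
function pullMatching(values, pattern) {
  // Keep non-strings and strings that do not match the pattern.
  return values.filter(function (item) {
    return !(typeof item === "string" && pattern.test(item));
  });
}

var suggestions = [
  "ok",
  "see http://example.com/page",
  "not ok " + "/".repeat(20000) // stand-in for the corrupted value
];

var afterTwo = pullMatching(suggestions, /\/\//);   // two consecutive slashes
var afterFour = pullMatching(suggestions, /\/{4}/); // four consecutive slashes

console.log(afterTwo);  // the URL-like value is removed as well
console.log(afterFour); // only the long slash run is removed
```

If the real data can contain URLs, a pattern such as /\/{4}/ in both the filter and the $pull condition is probably the safer choice.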
Since this didn't work for me, I ended up going with what the question discussed: updating all documents individually by filtering the array elements based on string length:
db.notes.find({
  suggestions: {$exists: true}
}).forEach(function(doc) {
  doc.suggestions = doc.suggestions.filter(function(item) {
    return (item || "").length <= 1000;
  });
  db.notes.save(doc);
});
It ran slowly, but that wasn't really a problem in this case.
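If speed did matter, one option would be to write back only the documents that actually change, and to send those writes in batches. The following is a minimal sketch of that idea, not what I actually ran: buildCleanupOps is a hypothetical helper, the sample documents are invented, and the code runs in plain JavaScript without a database. The operations it returns use the updateOne shape that db.collection.bulkWrite accepts.

```javascript
// Hypothetical helper: builds bulkWrite-style operations only for documents
// whose suggestions array contains at least one over-long entry.
function buildCleanupOps(docs, maxLength) {
  var ops = [];
  docs.forEach(function (doc) {
    if (!Array.isArray(doc.suggestions)) return; // nothing to clean
    // Same length test as the loop above.
    var kept = doc.suggestions.filter(function (item) {
      return (item || "").length <= maxLength;
    });
    if (kept.length !== doc.suggestions.length) {
      ops.push({
        updateOne: {
          filter: { _id: doc._id },
          update: { $set: { suggestions: kept } }
        }
      });
    }
  });
  return ops;
}

// In-memory sample data, invented for illustration.
var sampleDocs = [
  { _id: 1, suggestions: ["ok", "not ok " + "/".repeat(5000)] },
  { _id: 2, suggestions: ["fine"] },
  { _id: 3 }
];

var ops = buildCleanupOps(sampleDocs, 1000);
console.log(JSON.stringify(ops));
```

In the shell, the result could be passed to db.notes.bulkWrite(ops) for each batch of documents read from the cursor. Compared with saving every document, this replaces only the suggestions field and skips documents that are already clean.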