
How to $pull elements from an array, $where elements' string length > a large number?

An old slash-escaping bug left us with some messed-up data, like so:

{
    suggestions: [
        "ok",
        "not ok /////////// ... 10s of KBs of this ... //////",
    ]
}

I would like to just pull those bad values out of the array. My first idea was to $pull based on a regex that matches 4 "/" characters, but it appears that regexes do not work on large strings:

db.notes.count({suggestions: /\/\/\/\//}) // returns 0
db.notes.count({suggestions: {$regex: "////"}}) // returns 0

My next idea was to use a $where query to find documents that have suggestion strings longer than 1000 characters. That query works:

db.notes.count({
    suggestions: {$exists: true},
    $where: function() {
        return !!this.suggestions.filter(function (item) {
            return (item || "").length > 1000;
        }).length
    }
})
// returns a plausible number
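
The predicate inside that $where function could also be written with Array.prototype.some, which stops at the first oversized string instead of building a filtered array. This is a minimal sketch in plain JavaScript; the names hasOversizedSuggestion and maxLen are illustrative and not part of MongoDB:

```javascript
// Hypothetical helper: returns true if any element of doc.suggestions is a
// string longer than maxLen. A missing or non-array field, and non-string
// elements, are treated as "nothing oversized".
function hasOversizedSuggestion(doc, maxLen) {
    if (!Array.isArray(doc.suggestions)) {
        return false;
    }
    return doc.suggestions.some(function (item) {
        return typeof item === "string" && item.length > maxLen;
    });
}
```

Inside a $where function the same body would read from this rather than from a doc argument.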

But a $where query can't be used as the condition in a $pull update.

db.notes.update({
    suggestions: {$exists: true},
}, {
    $pull: {
        suggestions: {
            $where: function() {
                return !!this.suggestions.filter(function (item) {
                    return (item || "").length > 1000;
                }).length
            }
        }
    }
})

fails with a write error:

WriteResult({
    "nMatched" : 0,
    "nUpserted" : 0,
    "nModified" : 0,
    "writeError" : {
        "code" : 81,
        "errmsg" : "no context for parsing $where"
    }
})

I'm running out of ideas. Will I have to iterate over the entire collection, and $set: {suggestions: suggestions.filter(...)} for each document individually? Is there no better way to clean bad values out of an array of large strings in MongoDB?

(I'm only adding the "javascript" tag to get SO to format the code correctly)

The simple solution pointed out in the question comments should have worked. It does work with a test case that is a recreation of the original problem. Regexes can match on large strings; there is no special restriction there.

db.notes.updateOne({suggestions: /\/\//}, { "$pull": {suggestions: /\/\//}})

Since this didn't work for me, I ended up going with what the question discussed: updating all documents individually by filtering the array elements based on string length:

db.notes.find({
    suggestions: {$exists: true}
}).forEach(function(doc) {
    doc.suggestions = doc.suggestions.filter(function(item) {
        return (item || "").length <= 1000;
    });
    db.notes.save(doc);
});

It ran slowly, but that wasn't really a problem in this case.
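
If the server is MongoDB 4.2 or later, an update can take an aggregation pipeline, so the length check might be done server-side in a single statement rather than one write per document. The following is only a sketch under that version assumption and was not part of what I actually ran; buildCleanupPipeline and maxLen are hypothetical names. The $type guard is there because $strLenCP raises an error on non-string input.

```javascript
// Hypothetical helper: builds an update pipeline that rewrites `suggestions`
// to keep only elements that are non-strings, or strings of at most maxLen
// code points. Assumes MongoDB 4.2+ (pipeline-style updates).
function buildCleanupPipeline(maxLen) {
    return [{
        $set: {
            suggestions: {
                $filter: {
                    input: "$suggestions",
                    as: "item",
                    cond: {
                        $cond: [
                            // Only measure the length of actual strings
                            { $eq: [{ $type: "$$item" }, "string"] },
                            { $lte: [{ $strLenCP: "$$item" }, maxLen] },
                            // Non-string elements (null, numbers, ...) are kept as-is
                            true
                        ]
                    }
                }
            }
        }
    }];
}

// In the mongo shell this would be applied along the lines of:
// db.notes.updateMany({ suggestions: { $type: "array" } }, buildCleanupPipeline(1000));
```

Note that, unlike the forEach loop above, this sketch keeps non-string elements in the array; adjust the final branch of $cond if those should be dropped too.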
