
MongoDB query to find documents with variations

Example MongoDB document:

{
  name: "something",
  product: "ABC-123"
}

The problem is that product may not always follow the same naming convention. It could be any of the following:

"ABC-123"
"ABC123"
"ABC 123"

So if I search for "ABC-123", I want every document whose product matches, regardless of which naming variation it uses.

Edit: You can simply use $regex in your query with the expression ^ABC(?:.*?)\d+$ (written as \\d inside a quoted string), like so:

Example MongoDB documents:

db={
  "products": [
    {
      "name": "product A",
      "product": "ABC-123"
    },
    {
      "name": "product B",
      "product": "ABC123"
    },
    {
      "name": "product C",
      "product": "ABC-123"
    }
  ]
}

Query:

db.products.find({
  "product": {
    "$regex": "^ABC(?:.*?)\\d+$"
  }
})

Demo: https://mongoplayground.net/p/WdqTg7LCZIk
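The pattern above hard-codes the ABC prefix, and its lazy .*? also accepts values such as ABCXYZ999. As a rough sketch of a stricter alternative, the pattern could be built from the search term itself, allowing at most one hyphen or whitespace character between the letter and digit groups. The helper name buildVariantPattern is hypothetical (it is not part of MongoDB or of the answer above), and it assumes product codes consist only of ASCII letters and digits plus optional separators.

```javascript
// Hypothetical helper (an assumption for illustration, not a MongoDB API).
// Turns a search term such as "ABC-123" into a pattern string that matches
// "ABC-123", "ABC123" and "ABC 123".
function buildVariantPattern(term) {
  // Drop everything that is not an ASCII letter or digit (hyphens, spaces, ...).
  const compact = term.replace(/[^A-Za-z0-9]/g, "");
  // Split into alternating runs of letters and digits, e.g. ["ABC", "123"].
  const tokens = compact.match(/[A-Za-z]+|\d+/g) || [];
  // Allow one optional hyphen or whitespace character between the runs.
  return "^" + tokens.join("[-\\s]?") + "$";
}

const pattern = buildVariantPattern("ABC-123");
console.log(pattern);

// Assumed usage in the mongo shell:
//   db.products.find({ product: { $regex: pattern } })
```

Because the generated pattern starts with ^ and a literal prefix, it is the kind of expression that can make use of a regular index on product.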


A regular expression may work for this problem. Let's start with an expression similar to:

product:\s+"(.+?)"


Here, we use product:\s+" as the left boundary, then collect any characters, then bound the match on the right with " .

const regex = /product:\s+"(.+?)"/gm;
const str = `{ name: "something" product: "ABC-123" }`;
let m;

while ((m = regex.exec(str)) !== null) {
  // This is necessary to avoid infinite loops with zero-width matches
  if (m.index === regex.lastIndex) {
    regex.lastIndex++;
  }

  // The result can be accessed through the `m`-variable.
  m.forEach((match, groupIndex) => {
    console.log(`Found match, group ${groupIndex}: ${match}`);
  });
}

Or we can extend it to mark explicitly what we want to capture and what we do not:

(?:product:\s+")(.+?)(?:")
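As a small illustration of the expression with non-capturing groups, the sketch below collects only the captured product values using String.prototype.matchAll. The sample text and variable names are assumptions made up for this example.

```javascript
// Non-capturing groups wrap the boundaries; group 1 holds the product value.
const pattern = /(?:product:\s+")(.+?)(?:")/g;

// Assumed sample input: two documents in the loose format from the question.
const text = `{ name: "something" product: "ABC-123" }
{ name: "other" product: "ABC 123" }`;

// matchAll yields one match per occurrence; keep only capture group 1.
const products = Array.from(text.matchAll(pattern), (m) => m[1]);
console.log(products);
```

Since the boundaries are non-capturing, each match exposes a single group, so m[1] is always the product string.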


If your variation is limited to those three possibilities, then Emma's answer is just what you need. If the regular expression gets out of hand because you end up with many different product variations, another option is to combine a $text search (backed by a text index) with a regex.

For example:

db.getCollection('COLNAME').find({
  $or: [
    {
      $text: {$search: 'abc'}  // By default it is case insensitive
    },
    {
      product: {"$regex": "YOUR_REGEX_HERE"}
    }
  ]
})

This should also perform reasonably well, provided product has both a text index and a regular index (the regex clause benefits most from the regular index when the pattern is anchored with ^). It also covers cases such as XXX-ABC and other variations you may not know about in advance, so it is something to consider.
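A minimal sketch of how the combined filter and its indexes might be put together is shown below. The helper buildProductFilter and the example pattern are assumptions for illustration; COLNAME is the placeholder collection name used above, and the shell commands are left as comments because they need a running MongoDB instance.

```javascript
// Hypothetical helper: builds the $or filter from a text search term
// and a regex pattern string.
function buildProductFilter(searchText, pattern) {
  return {
    $or: [
      { $text: { $search: searchText } }, // served by the text index
      { product: { $regex: pattern } },   // served by the regular index
    ],
  };
}

// Example pattern (an assumption): ABC, an optional separator, then digits.
const filter = buildProductFilter("abc", "^ABC[-\\s]?\\d+$");
console.log(JSON.stringify(filter, null, 2));

// Assumed usage in the mongo shell:
//   db.getCollection('COLNAME').createIndex({ product: "text" })
//   db.getCollection('COLNAME').createIndex({ product: 1 })
//   db.getCollection('COLNAME').find(filter)
```

Both indexes matter here: MongoDB requires every clause of an $or that contains $text to be backed by an index. Note also that, as far as I understand text-index tokenization, searching for 'abc' finds "ABC-123" and "ABC 123" (split on the separator) but probably not "ABC123", which is one reason to keep the regex clause alongside it.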
