How to iterate through a nested array in Elasticsearch with a filter script?

I am trying to filter on a nested field in Elasticsearch: I need to return only the documents that satisfy certain rules. The following example reproduces the error I'm getting:

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "user": {
        "type": "nested" 
      }
    }
  }
}
PUT my-index-000001/_doc/1
{
  "group": "fans",
  "user": [
    {
      "first": "John",
      "last": "Smith"
    },
    {
      "first": "Alice",
      "last": "White"
    }
  ]
}

As can be seen, we have an array of objects (nested).

I need to apply a script on the nested field where I can go through the array of users.

For example, I tried this:

GET my-index-000001/_search
{
  "query": {
    "nested": {
      "path": "user",
      "query": {
        "bool": {
          "filter": [
            {
              "script": {
                "script": {
                  "inline": """
                  def users = doc['user'];
                  for ( int i = 0; i < users.size(); i++ ) {
                    
                  }
                  return true;
                  """
                }
              }
            }
          ]
        }
      }
    }
  }
}

I am getting this error:

{
  ...
          "script_stack" : [
            "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:90)",
            "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:41)",
            "users = doc['user'];\n                  ",
            "            ^---- HERE"
          ],
          ...
          "caused_by" : {
            "type" : "illegal_argument_exception",
            "reason" : "No field found for [user] in mapping with types []"
          }
        }
      }
    ]
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

Elasticsearch version 7.7

Is this possible to do? I have reviewed some answers but it is not clear to me.

Nested documents are powerful because they preserve the association between the attributes of each object, but the downside is that you cannot iterate over them in a script, as discussed here.

That said, you could flatten the users' attributes using the copy_to mapping parameter, like so:

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "user__first_flattened": {
        "type": "keyword"
      },
      "user": {
        "type": "nested",
        "properties": {
          "first": {
            "type": "keyword",
            "copy_to": "user__first_flattened"
          }
        }
      }
    }
  }
}

Then index the document:

PUT my-index-000001/_doc/1
{
  "group": "fans",
  "user": [
    {
      "first": "John",
      "last": "Smith"
    },
    {
      "first": "Alice",
      "last": "White"
    }
  ]
}

Now you have access to the field values at the root level and can iterate over them. Keep in mind that keyword doc values are returned sorted and deduplicated (hence [Alice, John] below rather than the original array order), so the loop index does not reliably identify the corresponding nested subdocument. The approach also assumes that the field you iterate over is present in every nested subdocument, so that your loop is not cut short:

GET my-index-000001/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "script": {
            "script": {
              "source": """
                  def users = doc.user__first_flattened;
                  // users == [Alice, John]
                  for ( int i = 0; i < users.size(); i++ ) {
                    
                  }
                  return true;
                  """
            }
          }
        }
      ]
    }
  }
}

Notice that we're not using a nested query anymore, because we're outside of the nested context and the flattened field is available in the root document.
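
As a minimal sketch of what the loop body could look like, assume the copy_to mapping above and a hypothetical rule of "keep documents where any user's first name starts with a given prefix". The parameter name prefix and the rule itself are illustrative assumptions, not part of the original question:

```
GET my-index-000001/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "script": {
            "script": {
              "source": """
                // doc values of the flattened keyword field (sorted, deduplicated)
                def firsts = doc['user__first_flattened'];
                for (int i = 0; i < firsts.size(); i++) {
                  // hypothetical rule: match on a first-name prefix
                  if (firsts.get(i).startsWith(params.prefix)) {
                    return true;
                  }
                }
                return false;
              """,
              "params": {
                "prefix": "A"
              }
            }
          }
        }
      ]
    }
  }
}
```

Passing the value through params rather than hard-coding it in the source lets Elasticsearch reuse the compiled script across requests with different prefixes.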


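It's also worth knowing that, instead of copy_to, you can set include_in_root on the nested field, which is equally useful here: it additionally indexes the nested fields in the root document as standard flat fields.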
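
A rough sketch of the include_in_root variant, assuming a freshly created index and that both name fields are mapped as keyword so that they have doc values. The matching rule (first name equals 'John') is only an illustration:

```
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "user": {
        "type": "nested",
        "include_in_root": true,
        "properties": {
          "first": { "type": "keyword" },
          "last": { "type": "keyword" }
        }
      }
    }
  }
}

GET my-index-000001/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "script": {
            "script": {
              "source": """
                // root-level copy of the nested field, available thanks to include_in_root
                def firsts = doc['user.first'];
                for (int i = 0; i < firsts.size(); i++) {
                  if (firsts.get(i) == 'John') {
                    return true;
                  }
                }
                return false;
              """
            }
          }
        }
      ]
    }
  }
}
```

As with copy_to, the script runs against the root document, so the query is not wrapped in a nested clause and the pairing between first and last within one user object is not visible to it.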

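The list of nested documents of a given parent document is not available within the nested context, which is why doc['user'] throws the exception in the script. However, an individual nested document can be accessed as shown below (this relies on the dynamic mapping from the question, where user.first is a text field with a keyword sub-field):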

GET my-index-000001/_search
{
  "query": {
    "nested": {
      "path": "user",
      "query": {
        "script": {
          "script": "doc['user.first.keyword'].value == 'John'"
        }
      }
    }
  }
}
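
A slightly more defensive sketch of the same idea, assuming the dynamic mapping from the question. It guards against nested documents that have no first name, passes the name as a parameter (the name params.name is an illustrative choice), and adds inner_hits so the response shows which nested documents matched:

```
GET my-index-000001/_search
{
  "query": {
    "nested": {
      "path": "user",
      "query": {
        "script": {
          "script": {
            "source": "doc['user.first.keyword'].size() > 0 && doc['user.first.keyword'].value == params.name",
            "params": {
              "name": "John"
            }
          }
        }
      },
      "inner_hits": {}
    }
  }
}
```

The script is evaluated once per nested document, so it still cannot see the other entries of the user array; it only decides whether the current nested document matches.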
