
Searching for whole URL using elastic4s

I'm using elastic4s to index and search data in Elasticsearch. Part of each document I store is a url field, and I need to search on the entire URL. The problem is that searching for a document by its url field returns 0 results.

For this search I define a mapping before inserting any data into the index:

client.execute {
  create index <my_index> mappings {
    <my_type> as {
      "url" typed StringType analyzer NotAnalyzed
    }
  }
}

I insert the data:

client.execute { index into <my_index> -> <my_type> fields (
  "url" -> "http://sample.com"
  )
}.await

And I search for the documents:

val filter =
"""
  |{
  |    "filtered" : {
  |        "query" : {
  |            "match_all" : {}
  |        },
  |        "filter" : {
  |            "bool" : {
  |              "should" : [
  |                { "bool" : {
  |                  "must" : [
  |                     { "term" : { "url" : "http://sample.com" } }
  |                  ]
  |                } }
  |              ]
  |            }
  |        }
  |    }
  |}
""".stripMargin

client.execute { search in <my_index> -> <my_type> rawQuery { filter } }.await

I get 0 results after executing this query. What am I doing wrong?

Thanks!

Problem solved.

In the mapping, instead of:

client.execute {
  create index <my_index> mappings {
    <my_type> as {
      "url" typed StringType analyzer NotAnalyzed
    }
  }
}

I should have written:

client.execute {
  create index <my_index> mappings {
    <my_type> as {
      "url" typed StringType index "not_analyzed"
    }
  }
}

or

client.execute {
  create index <my_index> mappings {
    <my_type> as {
      "url" typed StringType index NotAnalyzed
    }
  }
}
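
Why the index setting matters can be sketched without a cluster. The fix above suggests the url field was still being analyzed under the original mapping, so the stored terms were pieces of the URL rather than the whole string, while a term query looks for one exact term. The following is a rough model in plain Scala, not elastic4s or Elasticsearch code: the names UrlIndexSketch, analyzed, notAnalyzed and termMatches are made up for illustration, and the tokenizer is only an approximation of the standard analyzer.

```scala
// Rough model of indexing a single field; not the real analysis chain.
object UrlIndexSketch {
  // Approximation of an analyzed string field (assumption): lowercase the
  // value, then split on anything that is not a letter, a digit, or a dot.
  def analyzed(value: String): Set[String] =
    value.toLowerCase.split("[^a-z0-9.]+").filter(_.nonEmpty).toSet

  // A not_analyzed field keeps the original value as one single term.
  def notAnalyzed(value: String): Set[String] = Set(value)

  // A term query does no analysis of its own: it checks for the exact term.
  def termMatches(indexedTerms: Set[String], term: String): Boolean =
    indexedTerms.contains(term)

  def main(args: Array[String]): Unit = {
    val url = "http://sample.com"
    println(analyzed(url))                        // the URL split into pieces
    println(termMatches(analyzed(url), url))      // false
    println(termMatches(notAnalyzed(url), url))   // true
  }
}
```

In this model the analyzed field holds the terms http and sample.com, so a lookup of the full URL finds nothing, which matches the 0 results reported in the question.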

I think your query can be simplified a bit, to something like:

{
  "query": {
    "term": {
      "url": {
        "value": "http://example.com"
      }
    }
  }
}
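
If you keep building the query as a raw JSON string, as in the question, a small helper avoids hand-writing the braces each time. This is a minimal sketch in plain Scala: TermQuerySketch, jsonEscape and termQuery are hypothetical names, not part of elastic4s. It renders only the term object, without the outer "query" wrapper, on the assumption that rawQuery expects the query itself (the string in the question starts directly at "filtered").

```scala
object TermQuerySketch {
  // Escape backslashes and double quotes so the text is safe inside a JSON
  // string literal. Control characters are not handled in this sketch.
  def jsonEscape(s: String): String =
    s.replace("\\", "\\\\").replace("\"", "\\\"")

  // Hypothetical helper: renders a term query for one field and one exact value.
  def termQuery(field: String, value: String): String =
    s"""{ "term" : { "${jsonEscape(field)}" : { "value" : "${jsonEscape(value)}" } } }"""

  def main(args: Array[String]): Unit =
    println(termQuery("url", "http://sample.com"))
}
```

The resulting string could then be handed to rawQuery in place of filter in the question's search call. Whether that call accepts it unchanged depends on the elastic4s version and has not been verified here.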

Oddly, I would have expected your query to return all documents because of how the boolean clauses are nested. "should" means that documents matching any clause in the array are scored higher than those that match none. So a "should" containing only a single "must" ought to return all documents, with matching documents scored higher than non-matching ones.
