
Difference between Query Context and Filter Context while Querying

What is the difference between query context and filter context in the Elasticsearch Query DSL?

My understanding is that query context measures how well a document matches the query parameters.

Ex:

    { "match": { "title":   "Search"        }}

Suppose I search for documents with the title 'Search' and the index contains two documents:

      i)title:"Search"    
      ii)title:"Search 123"

Then the first document is a perfect match and the second is a partial match, so the first document is ranked first and the second is ranked second. Is my understanding correct?

Filter Context:
Ex:

    { "term":  { "status": "published" }}

Suppose I search for documents with the status 'published' and the index contains two documents:

      i)status:"published"    
      ii)status:"published 123"

Then the first document is an exact match, so it is returned, and the second is not an exact match, so it is not returned. Is my understanding correct?

Basically, in query context Elasticsearch works out how well each document matches the query, which means a score is calculated for each matching document. In filter context, by contrast, it only checks whether a document matches the query or not, i.e. the answer is just yes or no. Filter clauses do not contribute to the score of the document.
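To make the two contexts concrete, here is a minimal sketch of a request body that uses both, written as a plain Python dict so it can be inspected without a cluster. It relies on the `bool` query, where clauses under `must` run in query context (scored) and clauses under `filter` run in filter context (yes/no only). The variable name `search_body` is just illustrative, and the field names are the ones from the question.

```python
import json

# Request body combining both contexts in one bool query.
search_body = {
    "query": {
        "bool": {
            # Query context: contributes to the relevance score (_score).
            "must": [{"match": {"title": "Search"}}],
            # Filter context: only includes or excludes documents, no scoring.
            "filter": [{"term": {"status": "published"}}],
        }
    }
}

# Print the JSON that would be sent as the body of a _search request.
print(json.dumps(search_body, indent=2))
```

With a body like this, the `match` clause decides the ranking while the `term` clause only narrows the set of candidate documents.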

Next, the difference between the match and term queries. If you map a field as keyword, the field is not analysed and its inverted index contains the whole value as a single term: if status is mapped as keyword and you insert "published 123" into the status field, its inverted index contains ["published 123"]. If status is mapped as text, the value is analysed when it is inserted: if you insert "published 123", its inverted index will contain ["published","123"]. So whenever you use a term query, the query string is not analysed and Elasticsearch tries to find that exact term in the inverted index. If you use a match query, the query string is analysed, and all documents whose inverted index contains at least one of the analysed query terms are returned.
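The indexing difference between keyword and text fields can be sketched with a toy inverted index. This is not how Elasticsearch is implemented; `analyze_text` is a deliberately simplified stand-in for an analyzer (lowercase, then split on whitespace), and all function names here are hypothetical.

```python
from collections import defaultdict

def analyze_text(value):
    # Toy stand-in for a text analyzer: lowercase, then split on whitespace.
    return value.lower().split()

def analyze_keyword(value):
    # keyword fields are not analysed: the whole value is stored as one term.
    return [value]

def build_inverted_index(docs, analyzer):
    # Map each term to the sorted list of document ids that contain it.
    index = defaultdict(set)
    for doc_id, value in docs.items():
        for term in analyzer(value):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

# The two status values from the question, keyed by document id.
docs = {1: "published", 2: "published 123"}

keyword_index = build_inverted_index(docs, analyze_keyword)
text_index = build_inverted_index(docs, analyze_text)

print(keyword_index)  # "published 123" is kept as a single term
print(text_index)     # "published 123" is split into "published" and "123"
```

In the keyword index the term "published" points only at document 1, while in the text index it points at both documents, which is why the mapping decides what a term query can find.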

Your understanding of the difference between term and match queries is correct at the most basic level but, as Jettro commented, with the filter query you mentioned both documents will be selected. With a term query, the result really depends on what kind of analyzer you are using and how that affects the terms stored in the inverted index that Lucene uses. To quote an example from Elasticsearch: The Definitive Guide: "if you were to index ["Foo","Bar"] into an exact value not_analyzed field, or Foo Bar into an analyzed field with the whitespace analyzer, both would result in having the two terms Foo and Bar in the inverted index."

Under the hood, the term query looks up your query term in the inverted index, and a document is returned if any one of its indexed terms matches. For the first document the inverted index holds only "published"; for the second it holds both "published" and "123". Assuming status is an analyzed text field, both documents are therefore returned as matches.

It is also important to remember that the term query looks in the inverted index for the exact term only; it won't match variants such as "Published" or "publisheD" against "published".
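The lookup behaviour described above can be sketched the same way. The code below assumes status is an analyzed text field and uses a simplified analyzer (lowercase plus whitespace split); `term_lookup` and `match_lookup` are hypothetical helpers that only imitate the matching rules of the term and match queries, not their scoring.

```python
def analyze(value):
    # Simplified analyzer: lowercase, then split on whitespace.
    return value.lower().split()

def build_index(docs):
    # Inverted index for an analyzed field: term -> set of document ids.
    index = {}
    for doc_id, value in docs.items():
        for term in analyze(value):
            index.setdefault(term, set()).add(doc_id)
    return index

def term_lookup(index, term):
    # Like a term query: the input is NOT analysed, only the exact term is looked up.
    return sorted(index.get(term, set()))

def match_lookup(index, text):
    # Like a match query with OR semantics: the input is analysed first,
    # and a document matches if it contains any of the resulting terms.
    hits = set()
    for term in analyze(text):
        hits |= index.get(term, set())
    return sorted(hits)

status_index = build_index({1: "published", 2: "published 123"})

print(term_lookup(status_index, "published"))      # both documents contain this term
print(term_lookup(status_index, "Published"))      # no such exact term in the index
print(term_lookup(status_index, "published 123"))  # not a single term in an analyzed field
print(match_lookup(status_index, "Published"))     # analysed to "published" before lookup
```

The second and third lookups come back empty because the index only contains the lowercased, split terms, while the match-style lookup normalises its input before searching.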
