
Elasticsearch Terms Query: excluding a large number of users

I'm working on a Tinder-like app. To exclude profiles that the user has already swiped, I use a "must_not" query like this:

must_not : [{"terms": { "swipedusers": ["userid1", "userid1", "userid1"…]}}]

What are the limits of this approach? Is it scalable, and would it still work when the swipedusers array contains 2000 user IDs? If there is a better, more scalable approach, I would be happy to hear about it.

There is a better approach. It is called "terms lookup", and it works somewhat like the join you would do in a relational database.

I could try to explain it here, but everything you need is well documented on the official Elasticsearch page:

https://www.elastic.co/guide/en/elasticsearch/reference/5.0/query-dsl-terms-query.html#query-dsl-terms-lookup

The final solution is to have two indices: one for the registered users and another that tracks the swipes of each user. Then, for each swipe, you update the document that holds the current user's swipes. This means adding elements to an array, which is another problem in Elasticsearch (a big one if you are using AWS managed Elasticsearch) that can only be solved using scripting. More info at https://www.elastic.co/guide/en/elasticsearch/guide/current/partial-updates.html#_using_scripts_to_make_partial_updates
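As a rough illustration of the scripted array update mentioned above, here is a minimal Python sketch that only builds the request body for the Update API. The helper name `build_swipe_update` and the field name `swipedUserId` are assumptions chosen for illustration, and the `"source"` script key assumes a recent Elasticsearch release (early 5.x releases use `"inline"` instead).

```python
import json


def build_swipe_update(swiped_user_id, field="swipedUserId"):
    """Build an Update API body that appends one id to an array field.

    Hypothetical helper: it only assembles the JSON body, it does not
    talk to a cluster. The field name is an assumption.
    """
    # Painless script: add the id only if the array does not contain it yet.
    script = (
        "if (!ctx._source." + field + ".contains(params.id)) "
        "{ ctx._source." + field + ".add(params.id) }"
    )
    return {
        "script": {
            "lang": "painless",
            "source": script,  # assumption: early 5.x releases expect "inline"
            "params": {"id": swiped_user_id},
        },
        # If the swipes document does not exist yet, create it with one entry.
        "upsert": {field: [swiped_user_id]},
    }


if __name__ == "__main__":
    print(json.dumps(build_swipe_update("userid42"), indent=4))
```

On a 5.x cluster a body like this would typically be sent to `POST /swipes/users/current-user-id/_update`; the index, type, and document id in that path are placeholders. Passing the id through `params` rather than concatenating it into the script lets Elasticsearch reuse the compiled script across swipes.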

For your case, the query would look something like this:

GET /possible_matches/_search
{
    "query" : {
        "bool" : {
            "must_not" : {
                "terms" : {
                    "user" : {
                        "index" : "swipes",
                        "type" : "users",
                        "id" : "current-user-id",
                        "path" : "swipedUserId"
                    }
                }
            }
        }
    }
}
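A client would normally build this request per user. The sketch below is one hedged way to do that in Python: it assembles a terms lookup and wraps it in `must_not`, so that already-swiped profiles are excluded rather than selected. The function name `build_unswiped_query` and the default index, type, and field names are assumptions for illustration. The `type` entry belongs to the 5.x lookup syntax and can be dropped by passing `None` on versions without mapping types.

```python
import json


def build_unswiped_query(current_user_id, lookup_index="swipes",
                         lookup_type="users", path="swipedUserId",
                         field="user"):
    """Build a search body that excludes profiles the user already swiped.

    Hypothetical helper: all default names are assumptions, adjust them
    to the real index layout.
    """
    # Terms lookup: Elasticsearch fetches the swipe document itself and
    # reads the id list from `path`, so the client never sends the ids.
    lookup = {"index": lookup_index, "id": current_user_id, "path": path}
    if lookup_type is not None:
        lookup["type"] = lookup_type  # only for versions with mapping types
    return {"query": {"bool": {"must_not": [{"terms": {field: lookup}}]}}}


if __name__ == "__main__":
    print(json.dumps(build_unswiped_query("current-user-id"), indent=4))
```

The request stays the same size no matter how many profiles a user has swiped, which is the main advantage over sending a literal array of 2000 ids with every search.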

Another thing to take into account is the replication configuration of the swipes index. Since each node will perform "joins" against that index, it is highly recommended to keep a full copy of it on every node. You can achieve this by creating the index with "auto_expand_replicas" set to "0-all".

PUT /swipes
{
    "settings": {
        "auto_expand_replicas": "0-all"
    }
}
