MongoDB - Distinct, Limit, and Sort for better results

I'm trying to develop a query that mixes up the results of a search request in MongoDB. A very simplified example of my collection looks like this. Each document has a location to query, a ranking of the listing's quality, and the name of the provider who inserted the listing.

[
  {
    "location": "paris",
    "ranking": "998",
    "provider": "Alpha"
  },
  {
    "location": "paris",
    "ranking": "965",
    "provider": "Alpha"
  },
  {
    "location": "paris",
    "ranking": "945",
    "provider": "Alpha"
  },
  {
    "location": "paris",
    "ranking": "933",
    "provider": "Alpha"
  },
  {
    "location": "paris",
    "ranking": "953",
    "provider": "Alpha"
  },
  {
    "location": "paris",
    "ranking": "983",
    "provider": "Alpha"
  },
  {
    "location": "paris",
    "ranking": "700",
    "provider": "Beta"
  },
  {
    "location": "paris",
    "ranking": "745",
    "provider": "Beta"
  },
  {
    "location": "paris",
    "ranking": "670",
    "provider": "Omega"
  },
  {
    "location": "paris",
    "ranking": "885",
    "provider": "Omega"
  },
  {
    "location": "paris",
    "ranking": "500",
    "provider": "Omega"
  },
  {
    "location": "london",
    "ranking": "600",
    "provider": "Omega"
  },
  {
    "location": "london",
    "ranking": "650",
    "provider": "Beta"
  }
]

As you can see, provider Alpha has the most listings and the best rankings. So when I search for paris and sort by ranking, all the listings from Alpha end up on top, and the Beta and Omega listings are shoved to the bottom.

What I'd like to do is limit each provider to 3 listings. Alpha's listings would still be on top, but with only 3 of them shown, the Beta and Omega listings would appear higher up. The remaining Alpha listings could then be seen on "page 2" when .skip is used.
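The capping and paging idea can be sketched in plain Python on an in-memory copy of the sample data, without touching MongoDB. This is only an illustration under assumptions: the function name `paged_by_provider`, its `page` and `per_provider` parameters, and the reading that "page 2" holds each provider's 4th to 6th listings are hypothetical and not part of the original post.

```python
from collections import defaultdict

# Sample rows copied from the collection above: (location, ranking, provider).
ROWS = [
    ("paris", "998", "Alpha"), ("paris", "965", "Alpha"), ("paris", "945", "Alpha"),
    ("paris", "933", "Alpha"), ("paris", "953", "Alpha"), ("paris", "983", "Alpha"),
    ("paris", "700", "Beta"), ("paris", "745", "Beta"),
    ("paris", "670", "Omega"), ("paris", "885", "Omega"), ("paris", "500", "Omega"),
    ("london", "600", "Omega"), ("london", "650", "Beta"),
]
LISTINGS = [{"location": l, "ranking": r, "provider": p} for l, r, p in ROWS]


def rank(doc):
    # Rankings are stored as strings, so compare them as integers.
    return int(doc["ranking"])


def paged_by_provider(listings, location, page=0, per_provider=3):
    """Return one page holding at most `per_provider` listings per provider."""
    by_provider = defaultdict(list)
    matching = [d for d in listings if d["location"] == location]
    # Group the listings by provider, best ranking first within each group.
    for doc in sorted(matching, key=rank, reverse=True):
        by_provider[doc["provider"]].append(doc)
    # Take the slice of each provider's listings that belongs to this page.
    start = page * per_provider
    picked = []
    for docs in by_provider.values():
        picked.extend(docs[start:start + per_provider])
    # Merge the per-provider slices back into one ranked list.
    return sorted(picked, key=rank, reverse=True)


page_one = [d["ranking"] for d in paged_by_provider(LISTINGS, "paris", page=0)]
page_two = [d["ranking"] for d in paged_by_provider(LISTINGS, "paris", page=1)]
print(page_one)
print(page_two)
```

On the sample data, the first page mixes the three best Alpha listings with all of the Beta and Omega listings, and the second page holds the three remaining Alpha listings.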

If I were to do this in Python, a synchronous example would look like this.

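#!/usr/bin/env python
# -*- coding: utf-8 -*-

def search_limited(colc, location='paris', per_provider=3):
    """Return up to `per_provider` top-ranked listings for each provider.

    `colc` is the pymongo collection that holds the listings.
    """
    results = []
    # One query to list the providers, then one more query per provider.
    providers_available = colc.find({'location': location}).distinct('provider')
    for provider in providers_available:
        search = (colc.find({'provider': provider, 'location': location})
                      .sort('ranking', -1)   # best listings first
                      .limit(per_provider))
        results.extend(search)
    # Rankings are stored as strings, so convert them before comparing.
    return sorted(results, key=lambda k: int(k['ranking']), reverse=True)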

This is heavy, time consuming, and overall just sucks, especially with a collection of 2.5 million documents. How could I do this all on MongoDB's side? Thanks!

You could try some server-side JS, e.g.

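// Distinct providers that have at least one listing in paris
var providers = db.runCommand({distinct: "colc", key: "provider", query: {"location": "paris"}}).values;
providers.forEach(function (provider) {
    // Top 3 listings for this provider, best ranking first
    var c = db.colc.find({"provider": provider, "location": "paris"}).sort({"ranking": -1}).limit(3);
    c.forEach(printjson);
});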

But as all JS is interpreted, it's not going to be the fastest option.

You could play with the aggregation framework, which does most of the work on the server side, e.g.

db.colc.aggregate([ 
    {$match: {"location":"paris"}}, 
    {$group:{_id: { "provider": "$provider", "location":"$location"}, 
             "rankings" : { $addToSet: "$ranking"} } } 
]);

But you'll need a bit of client-side code to pick out the rankings for each provider from the returned array.

{
    "result" : [
        {
            "_id" : {
                "provider" : "Omega",
                "location" : "paris"
            },
            "rankings" : [
                "500",
                "885",
                "670"
            ]
        },
        {
            "_id" : {
                "provider" : "Beta",
                "location" : "paris"
            },
            "rankings" : [
                "745",
                "700"
            ]
        },
        {
            "_id" : {
                "provider" : "Alpha",
                "location" : "paris"
            },
            "rankings" : [
                "983",
                "953",
                "933",
                "945",
                "965",
                "998"
            ]
        }
    ],
    "ok" : 1
}
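The client-side step mentioned above might look something like the following Python sketch. It assumes the response has exactly the shape printed above (a dict with a "result" list); the function name `top_rankings` and the `per_provider` parameter are hypothetical. Note that the grouped output only carries the ranking strings, so this yields (provider, ranking) pairs rather than full documents.

```python
# Response copied from the aggregation output above.
RESULT = {
    "result": [
        {"_id": {"provider": "Omega", "location": "paris"},
         "rankings": ["500", "885", "670"]},
        {"_id": {"provider": "Beta", "location": "paris"},
         "rankings": ["745", "700"]},
        {"_id": {"provider": "Alpha", "location": "paris"},
         "rankings": ["983", "953", "933", "945", "965", "998"]},
    ],
    "ok": 1,
}


def top_rankings(aggregate_result, per_provider=3):
    """Keep the best `per_provider` rankings of each group, then merge them."""
    picked = []
    for group in aggregate_result["result"]:
        # The arrays built by $addToSet come back unordered, so sort here.
        # Rankings are strings, hence key=int.
        best = sorted(group["rankings"], key=int, reverse=True)[:per_provider]
        picked.extend((group["_id"]["provider"], r) for r in best)
    # Merge all providers into a single list, best ranking first.
    return sorted(picked, key=lambda pair: int(pair[1]), reverse=True)


merged = top_rankings(RESULT)
for provider, ranking in merged:
    print(provider, ranking)
```

Sorting each array on the client is needed because the arrays in the output above are not in ranking order.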
