
PHP MongoDB selecting fields by key (distinct)

I have a database of lottery games from around the world. This is what each document looks like:

[screenshot of a sample document]

Each game can appear under several country_code values, or under several state_code values if the country has states (Canada, USA).

Selecting all game_ids, and then all the countries and/or states each one belongs to, is done like this:

// get all games
// $colCurrent = MongoCollection Object
$gamesRes = $colCurrent->distinct('game_id');
foreach($gamesRes as $gameId) {
    $disCountries = $colCurrent->distinct('country_code',array('game_id' => $gameId));
    $disStates = $colCurrent->distinct('state_code',array('game_id' => $gameId));
}

I believe this is an inappropriate way to do it, as it sends a lot of queries to the database. I've tried using the aggregate function, but it only selects one field, like distinct.

Can anyone help optimize this query?

Thanks a lot!

Depending on what you are trying to achieve and the size of your data set, there are a few different approaches you can take.

Some examples using the Aggregation Framework in the mongo shell (MongoDB 2.2+):

1) Find all games and for each game create the set of unique country_code and state_code values:

db.games.aggregate(
    { $group: {
        _id: { gameId: "$game_id" },
        countries: { $addToSet: "$country_code" },
        states: { $addToSet: "$state_code" }
    }}
)

2) Find all games, and group by the unique combination of gameId, country_code, and state_code, including a count:

db.games.aggregate(
    { $group: {
        _id: {
            gameId: "$game_id",
            country_code: "$country_code",
            state_code: "$state_code"
        },
        total: { $sum: 1 }
    }}
)

In this second example, note that the _id used for grouping can include multiple fields.
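To make the shape of the two results concrete, here is a rough plain-JavaScript emulation of both groupings. It runs without a database; the sample documents and the function names groupSets and groupCombos are invented for illustration, and the real server may return results in a different order.

```javascript
// Hypothetical sample documents: field names follow the question, values are invented.
const sampleDocs = [
  { game_id: 1, country_code: "US", state_code: "NY" },
  { game_id: 1, country_code: "US", state_code: "CA" },
  { game_id: 1, country_code: "US", state_code: "NY" },
  { game_id: 2, country_code: "GB" },
];

// Emulates example 1: one result per game_id with the distinct countries and states.
// Skipping absent fields is an assumption of this sketch.
function groupSets(docs) {
  const byGame = new Map();
  for (const doc of docs) {
    if (!byGame.has(doc.game_id)) {
      byGame.set(doc.game_id, { countries: new Set(), states: new Set() });
    }
    const entry = byGame.get(doc.game_id);
    if (doc.country_code !== undefined) entry.countries.add(doc.country_code);
    if (doc.state_code !== undefined) entry.states.add(doc.state_code);
  }
  return Array.from(byGame, ([gameId, entry]) => ({
    _id: { gameId },
    countries: Array.from(entry.countries),
    states: Array.from(entry.states),
  }));
}

// Emulates example 2: one result per distinct (game, country, state) combination.
function groupCombos(docs) {
  const totals = new Map();
  for (const doc of docs) {
    const id = {
      gameId: doc.game_id,
      country_code: doc.country_code,
      state_code: doc.state_code,
    };
    const key = JSON.stringify(id); // absent fields are dropped by JSON.stringify
    const current = totals.get(key) || { _id: id, total: 0 };
    current.total += 1;
    totals.set(key, current);
  }
  return Array.from(totals.values());
}

console.log(JSON.stringify(groupSets(sampleDocs)));
console.log(JSON.stringify(groupCombos(sampleDocs)));
```

The first function yields one entry per game, the second one entry per combination, which is the difference between grouping on a single field and grouping on a compound _id.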

If you don't want to group on all the documents in the collection, you can make these aggregations more efficient by starting with the $match operator to limit the pipeline to the data you need ($match can also take advantage of a suitable index).
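As a sketch of that idea, the helper below builds the pipeline from example 1 with an optional leading $match stage. The function name buildGamePipeline and its gameIds parameter are hypothetical; filtering on game_id is only one possible filter.

```javascript
// Builds the pipeline from example 1, optionally restricted to some game_ids.
// `gameIds` is a hypothetical parameter, not something from the original question.
function buildGamePipeline(gameIds) {
  const pipeline = [];
  if (Array.isArray(gameIds) && gameIds.length > 0) {
    // $match goes first so the later stages only see the documents that are needed.
    pipeline.push({ $match: { game_id: { $in: gameIds } } });
  }
  pipeline.push({
    $group: {
      _id: { gameId: "$game_id" },
      countries: { $addToSet: "$country_code" },
      states: { $addToSet: "$state_code" },
    },
  });
  return pipeline;
}

// Print the pipeline that would be sent for games 1 and 2.
console.log(JSON.stringify(buildGamePipeline([1, 2]), null, 2));
```

In the mongo shell this would be used as db.games.aggregate(buildGamePipeline([1, 2])); called without arguments, it falls back to grouping the whole collection.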

Assuming you mean the total number of distinct countries and the total number of distinct states for each game_name (assuming there is one name per game_id and that the name is more readable; interchange them if needed):

Posting as mongo shell for general clarity; adapt to your driver and language as required:

db.lottery.aggregate([
    {$project: { country_code: 1, state_code: 1, game_name: 1 } },
    {$group: { 
        _id: "$game_name",
        countries: {$addToSet: "$country_code"},
        states: {$addToSet: "$state_code"}
    }},
    {$unwind: "$countries"},
    {$group:{ _id: { id: "$_id", states: "$states" }, country_count: {$sum: 1 } }},
    {$project: { _id: 0, game: "$_id.id", countries: "$country_count", states: "$_id.states" }},
    {$unwind: "$states"},
    {$group: { _id: { id: "$game", countries: "$countries" }, state_count: {$sum: 1 } }},
    {$project: { _id: 0, game: "$_id.id", countries: "$_id.countries", states: "$state_count" }},
    {$sort: { game: 1 }}
])

So there are a few fancy stages here:

  1. Project only the fields that are needed
  2. Group on the game and add each country and state to a set of its own ($addToSet keeps only distinct values)
  3. Now unwind the countries to get one record per country
  4. Group to sum the countries while retaining the game and the states array
  5. Project --optional-- to make the records appear more natural. $group messes with _id
  6. Unwind the states to get one record per state
  7. Group to sum the states while retaining the game and the countries count
  8. Project into something more natural
  9. Sort --optional-- by what you like. In this case the name of the game

Phew! A reasonably hefty aggregate, but it does show a way to work out the problem.
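For comparison, a shorter variant is possible if the server supports the $size aggregation operator (MongoDB 2.6 or later, which is an assumption about your setup and not part of the pipeline above). This is a sketch of a different technique, counting the sets directly instead of unwinding them; the variable name countPipeline is made up.

```javascript
// Alternative sketch: count the distinct values with $size instead of $unwind + $group.
// Assumes MongoDB 2.6+ and the same field names as the pipeline above.
const countPipeline = [
  { $group: {
      _id: "$game_name",
      countries: { $addToSet: "$country_code" },
      states: { $addToSet: "$state_code" },
  } },
  { $project: {
      _id: 0,
      game: "$_id",
      countries: { $size: "$countries" },
      states: { $size: "$states" },
  } },
  { $sort: { game: 1 } },
];

console.log(JSON.stringify(countPipeline, null, 2));
```

It would be run as db.lottery.aggregate(countPipeline). One difference worth checking on your own data: $unwind emits nothing for an empty array, so the two approaches may treat games without any states differently.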

DISCLAIMER:

I have made the huge assumption here that your data already makes some sense and that there are not multiple records of a game per country and/or per state. The additional "I didn't do it" part is that your code did not discern states within countries, so "I didn't do it either" :-P

You can add in $group stages to do that though. Part of the fun of programming is learning and working out how to do things by yourself. So this should be a good place to start, if not a perfect fit.
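If you want a hint, one possible shape is to group on game and country first and then regroup on the game alone. This is only a sketch under the same field-name assumptions as above; the variable name statesPerCountryPipeline and the output field names are invented.

```javascript
// Sketch: keep states nested under the country they belong to.
// Output field names (countries, country_code, states) are illustrative choices.
const statesPerCountryPipeline = [
  // First pass: one record per (game, country) with that country's distinct states.
  { $group: {
      _id: { game: "$game_name", country: "$country_code" },
      states: { $addToSet: "$state_code" },
  } },
  // Second pass: one record per game, collecting the per-country records.
  { $group: {
      _id: "$_id.game",
      countries: { $push: { country_code: "$_id.country", states: "$states" } },
  } },
  { $sort: { _id: 1 } },
];

console.log(JSON.stringify(statesPerCountryPipeline, null, 2));
```

Run it as db.lottery.aggregate(statesPerCountryPipeline) and adjust the second $group to whatever output shape suits your application.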

The reference is a really good place for learning how to apply all the operators used here. Apply one stage at a time (data size permitting) to get a good idea of what is going on in each step.
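One way to do the stage-at-a-time inspection is to run successively longer prefixes of the pipeline. The helper below is a small sketch of that; pipelinePrefixes is a made-up name, and only the first three stages of the pipeline are repeated here to keep it short.

```javascript
// Returns the pipeline truncated after each stage in turn: [s1], [s1, s2], ...
function pipelinePrefixes(pipeline) {
  return pipeline.map((_, i) => pipeline.slice(0, i + 1));
}

// First three stages of the pipeline above, repeated for illustration.
const firstStages = [
  { $project: { country_code: 1, state_code: 1, game_name: 1 } },
  { $group: {
      _id: "$game_name",
      countries: { $addToSet: "$country_code" },
      states: { $addToSet: "$state_code" },
  } },
  { $unwind: "$countries" },
];

pipelinePrefixes(firstStages).forEach((prefix, i) => {
  // In the mongo shell, pass `prefix` to db.lottery.aggregate and inspect the output.
  console.log("prefix " + (i + 1) + " ends with " + Object.keys(prefix[i])[0]);
});
```

Each prefix is itself a valid pipeline, so comparing the output of one prefix with the next shows exactly what the added stage changed.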
