PHP MongoDB selecting fields by key (distinct)
I have a database with lottery games across the world. This is what each document looks like:
Each game can appear under a different country_code, or under a state_code if the country has states (Canada, USA).
Selecting all game_ids, and then all the countries and/or states each one belongs to, is done like this:
// get all games
// $colCurrent = MongoCollection Object
$gamesRes = $colCurrent->distinct('game_id');
foreach ($gamesRes as $gameId) {
    $disCountries = $colCurrent->distinct('country_code', array('game_id' => $gameId));
    $disStates = $colCurrent->distinct('state_code', array('game_id' => $gameId));
}
I believe this is an inappropriate way to do it, as it issues a lot of queries to the database. I've tried using the aggregate function, but it only selects one field, like distinct.
Can anyone help optimize this query?
Thanks a lot!
Depending on what you are trying to achieve and the size of your data set, there are a few different approaches you can take.
Some examples using the Aggregation Framework in the mongo shell (MongoDB 2.2+):
1) Find all games and for each game create the set of unique country_code and state_code values:
db.games.aggregate(
    { $group: {
        _id: { gameId: "$game_id" },
        countries: { $addToSet: "$country_code" },
        states: { $addToSet: "$state_code" }
    }}
)
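To make it concrete what that $group stage computes, here is a minimal plain-JavaScript sketch that emulates it on an in-memory array. The sample documents and the groupByGame helper are hypothetical (only the field names come from the question), and skipping documents that lack a field is an assumption about how you would want missing values handled. MongoDB does not guarantee the order of elements produced by $addToSet, whereas this sketch keeps insertion order.

```javascript
// Hypothetical sample documents; only the field names come from the question.
const sampleGames = [
  { game_id: 1, country_code: "US", state_code: "NY" },
  { game_id: 1, country_code: "US", state_code: "CA" },
  { game_id: 1, country_code: "CA", state_code: "ON" },
  { game_id: 2, country_code: "FR" }, // a country without states
];

// Emulates grouping by game_id and collecting unique country/state codes,
// i.e. what $group with two $addToSet accumulators produces.
function groupByGame(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.game_id)) {
      groups.set(doc.game_id, { countries: new Set(), states: new Set() });
    }
    const group = groups.get(doc.game_id);
    // Assumption: a missing field contributes nothing to the set.
    if (doc.country_code !== undefined) group.countries.add(doc.country_code);
    if (doc.state_code !== undefined) group.states.add(doc.state_code);
  }
  return [...groups].map(([gameId, g]) => ({
    _id: { gameId },
    countries: [...g.countries],
    states: [...g.states],
  }));
}

console.log(JSON.stringify(groupByGame(sampleGames), null, 2));
```

One result document per game replaces the two distinct() calls per game in the original loop.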
2) Find all games, and group by the unique combination of gameId, country_code, and state_code, including a count:
db.games.aggregate(
    { $group: {
        _id: {
            gameId: "$game_id",
            country_code: "$country_code",
            state_code: "$state_code"
        },
        total: { $sum: 1 }
    }}
)
In this second example, note that the _id used for grouping can include multiple fields.
If you don't want to group on all the documents in the collection, you could make these aggregations more efficient by starting with the $match operator to limit the pipeline to the data you need ($match can also take advantage of a suitable index).
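As a rough illustration of putting $match first, the pipeline below filters before grouping. The filter value "US" and the variable name matchThenGroup are hypothetical; substitute whatever criteria apply to your data.

```javascript
// Sketch only: the filter value "US" is a made-up example.
const matchThenGroup = [
  // Filter first so later stages only see the documents you need;
  // as the first stage, $match can use an index on country_code if one exists.
  { $match: { country_code: "US" } },
  { $group: {
      _id: { gameId: "$game_id" },
      states: { $addToSet: "$state_code" }
  }}
];

// In the mongo shell this would be run as:
//   db.games.aggregate(matchThenGroup)
```

The pipeline is written as an array of stages, which db.games.aggregate() accepts.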
I'm assuming you mean the total number of distinct countries and the total number of distinct states for each game_name (assuming there is one game_name per game_id and that the name is more readable; swap in game_id if needed).
Posting as mongo shell for general clarity; adapt to your driver and language as required:
db.lottery.aggregate([
    {$project: { country_code: 1, state_code: 1, game_name: 1 } },
    {$group: {
        _id: "$game_name",
        countries: {$addToSet: "$country_code"},
        states: {$addToSet: "$state_code"}
    }},
    {$unwind: "$countries"},
    {$group: { _id: { id: "$_id", states: "$states" }, country_count: {$sum: 1 } }},
    {$project: { _id: 0, game: "$_id.id", countries: "$country_count", states: "$_id.states" }},
    {$unwind: "$states"},
    {$group: { _id: { id: "$game", countries: "$countries" }, state_count: {$sum: 1 } }},
    {$project: { _id: 0, game: "$_id.id", countries: "$_id.countries", states: "$state_count" }},
    {$sort: { game: 1 }}
])
So there are a few fancy stages here:
Phew! A reasonably hefty aggregate, but it does show a way to work out the problem.
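For comparison, on MongoDB 2.6 or later the $size operator can count the elements of each set directly, which avoids the $unwind and regroup steps. This is a different technique from the pipeline above, not part of the original answer, and the variable name countWithSize is hypothetical; treat it as a sketch under the same assumptions about the data.

```javascript
// Sketch assuming MongoDB 2.6+, where the $size array operator is available.
const countWithSize = [
  { $group: {
      _id: "$game_name",
      countries: { $addToSet: "$country_code" },
      states: { $addToSet: "$state_code" }
  }},
  // Replace each set with its element count.
  { $project: {
      _id: 0,
      game: "$_id",
      countries: { $size: "$countries" },
      states: { $size: "$states" }
  }},
  { $sort: { game: 1 } }
];

// In the mongo shell this would be run as:
//   db.lottery.aggregate(countWithSize)
```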
DISCLAIMER:
I have made the huge assumption here that your data already makes some sense and that there are not multiple records of a game per country and/or per state. The additional "I didn't do it" part is that your code did not discern states within countries, so "I didn't do it either" :-P
You can add in $group stages to do that though. Part of the fun of programming is learning and working out how to do things by yourself. So this should be a good place to start, if not a perfect fit.
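One possible shape for those extra $group stages, offered as a starting point rather than a finished answer: group on the game and country together first, then regroup per game so each country carries its own list of states. The variable name statesWithinCountries and the output field names are hypothetical.

```javascript
// Sketch: nest states under the country they belong to.
const statesWithinCountries = [
  // First pass: one document per (game, country) with that country's states.
  { $group: {
      _id: { game: "$game_name", country: "$country_code" },
      states: { $addToSet: "$state_code" }
  }},
  // Second pass: one document per game, collecting its countries.
  { $group: {
      _id: "$_id.game",
      countries: { $push: { country: "$_id.country", states: "$states" } }
  }}
];

// In the mongo shell this would be run as:
//   db.lottery.aggregate(statesWithinCountries)
```

$push is safe in the second stage because the first stage already produced exactly one document per game and country pair.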
The reference is a really good place for learning how to apply all the operators used here. Apply one stage at a time (data size permitting) to get a good idea of what is going on in each step.