
How to group MongoDB documents in PHP

I am using MongoDB server 3.6 with PHP as the backend language. I am also using the latest php-mongo library to talk to the updated driver.

I have a collection with 10 million records in it, like so:

[
  {
    "did": "123456",
    "did_usage": "1",
    "did_timestamp": "15012"
  },
  {
    "did": "4567811",
    "did_usage": "1",
    "did_timestamp": "15013"
  },
  {
    "did": "46465464",
    "did_usage": "2",
    "did_timestamp": "15014"
  },
  {
    "did": "7894446",
    "did_usage": "2",
    "did_timestamp": "15015"
  },
  {
    "did": "65646131",
    "did_usage": "3",
    "did_timestamp": "15016"
  },
  {
    "did": "7989464",
    "did_usage": "2",
    "did_timestamp": "15017"
  },
  {
    "did": "651651664",
    "did_usage": "1",
    "did_timestamp": "15018"
  }.......
]

Now I want to find a single unique document that has the lowest did_usage and the lowest did_timestamp.

So far I have managed to select a single document using:

$sample = array('$sample' => array('size' => 1));
$pipeline = array($match, $group, $project, $sample);
$cursor = $collection->aggregate($pipeline);

I need some help with the $group stage. I have tried this:

$group = array('$group' => array('_id' => '$did_usage', 'did_usage_timestamp' => array('$min' => '$did_usage_timestamp')));

But this is not working as expected.

Even without knowing what $match and $project are doing in your original code, we can assume that $group is operating on a subset of the collection data, since it is preceded only by $match in the pipeline. Based on the example collection data and the $group stage alone, it is evident that the $did_usage_timestamp field path used with the $min operator references a field that does not exist in the documents entering the $group stage. The documents contain did_usage and did_timestamp, but no did_usage_timestamp.

When testing this locally, the $min result (did_timestamp) in the randomly selected output document is null:

<?php

require 'vendor/autoload.php';

$client = new MongoDB\Client;
$collection = $client->test->foo;

$collection->drop();
$collection->insertMany([
    ["did" => "123456", "did_usage" => "1", "did_timestamp" => "15012"],
    ["did" => "4567811", "did_usage" => "1", "did_timestamp" => "15013"],
    ["did" => "46465464", "did_usage" => "2", "did_timestamp" => "15014"],
    ["did" => "7894446", "did_usage" => "2", "did_timestamp" => "15015"],
    ["did" => "65646131", "did_usage" => "3", "did_timestamp" => "15016"],
    ["did" => "7989464", "did_usage" => "2", "did_timestamp" => "15017"],
]);

$cursor = $collection->aggregate([
    ['$group' => ['_id' => '$did_usage', 'did_timestamp' => ['$min' => '$did_usage_timestamp']]],
    ['$sample' => ['size' => 1]],
]);

var_dump($cursor->toArray());

This outputs something similar to:

array(1) {
  [0]=>
  object(MongoDB\Model\BSONDocument)#14 (1) {
    ["storage":"ArrayObject":private]=>
    array(2) {
      ["_id"]=>
      string(1) "1"
      ["did_timestamp"]=>
      NULL
    }
  }
}

Changing the $min operator's field path to $did_timestamp should resolve the issue.
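As a minimal sketch of that change, the script below repeats the local test with only the field path corrected. It assumes the same setup as the test above: a MongoDB server reachable at the default address, the mongodb/mongodb library installed via Composer, and a throwaway test.foo collection. The $results variable name is introduced here for illustration.

```php
<?php

require 'vendor/autoload.php';

$client = new MongoDB\Client;
$collection = $client->test->foo;

// Reset the scratch collection with the sample documents from the question.
$collection->drop();
$collection->insertMany([
    ["did" => "123456", "did_usage" => "1", "did_timestamp" => "15012"],
    ["did" => "4567811", "did_usage" => "1", "did_timestamp" => "15013"],
    ["did" => "46465464", "did_usage" => "2", "did_timestamp" => "15014"],
    ["did" => "7894446", "did_usage" => "2", "did_timestamp" => "15015"],
    ["did" => "65646131", "did_usage" => "3", "did_timestamp" => "15016"],
    ["did" => "7989464", "did_usage" => "2", "did_timestamp" => "15017"],
]);

$cursor = $collection->aggregate([
    // Group by did_usage and keep the smallest did_timestamp in each group.
    // '$did_timestamp' is a field that actually exists in the documents.
    ['$group' => ['_id' => '$did_usage', 'did_timestamp' => ['$min' => '$did_timestamp']]],
    // Pick one of the resulting groups at random, as in the original pipeline.
    ['$sample' => ['size' => 1]],
]);

$results = $cursor->toArray();
var_dump($results);
```

With the corrected path, did_timestamp in the selected group should hold a value instead of NULL. Note that the sample data stores did_usage and did_timestamp as strings, so $min compares them as strings; that matches numeric order here only because the timestamps all have the same number of digits.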

Notice: technical posts on this site are published under the CC BY-SA 4.0 license. If you repost this content, please credit this site's URL or the original address.

 