MongoDB, PHP getting unique visitors per day

I'm creating an analytics script using PHP and MongoDB and I am a bit stuck. I would like to get the number of unique visitors per day within a certain time frame.

{
   "_id": ObjectId("523768039b7e7a1505000000"),
   "ipAddress": "127.0.0.1",
   "pageId": ObjectId("522f80f59b7e7a0f2b000000"),
   "uniqueVisitorId": "0445905a-4015-4b70-a8ef-b339ab7836f1",
   "recordedTime": ISODate("2013-09-16T20:20:19.0Z") 
}

The fields to filter on are uniqueVisitorId and recordedTime.

I've created a database object in PHP that opens a database connection when it is constructed. The MongoDB PHP functions are simply mapped to public methods that use this connection.

Anyhow, so far I get the number of visitors per day with:

public function GetUniqueVisitorsDiagram() {
    // MAP
    $map = new MongoCode('function() {
        day = new Date(Date.UTC(this.recordedTime.getFullYear(), this.recordedTime.getMonth(), this.recordedTime.getDate()));
        emit({day: day, uniqueVisitorId:this.uniqueVisitorId},{count:1});
    }');
    // REDUCE
    $reduce = new MongoCode("function(key, values) {
            var count = 0;
            values.forEach(function(v) {
                count += v['count'];
            });
            return {count: count};
        }");
    // STATS
    $stats = $this->database->Command(array(
        'mapreduce' => 'statistics',
        'map' => $map,
        'reduce' => $reduce,
        "query" => array(
            "recordedTime" =>
                array(
                    '$gte' => $this->startDate,
                    '$lte' => $this->endDate
                )
            ),
            "out" => array(
                "inline" => 1
            )
        ));
    return $stats;          
}

How would I filter this data correctly to get unique visitors? Or would it be better to use aggregation? If so, could you be so kind as to help me out with a code snippet?

The $group operator in the aggregation framework was designed for exactly this use case and will likely be ~10 to 100 times faster. Read up on the group operator here: http://docs.mongodb.org/manual/reference/aggregation/group/

And the php driver implementation here: http://php.net/manual/en/mongocollection.aggregate.php

You can combine the $group operator with other operators to further limit your aggregations. It's probably best you do some reading up on the framework yourself to better understand what's happening, so I'm not going to post a complete example for you.

    $m=new MongoClient();
    $db=$m->super_test;
    $db->gjgjgjg->insert(array(
        "ipAddress" => "127.0.0.1",
        "pageId" => new MongoId("522f80f59b7e7a0f2b000000"),
        "uniqueVisitorId" => "0445905a-4015-4b70-a8ef-b339ab7836f1",
        "recordedTime" => new MongoDate(strtotime("2013-09-16T20:20:19.0Z"))
    ));
    var_dump($db->gjgjgjg->find(array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week')))))->count()); // Prints 1
    $res=$db->gjgjgjg->aggregate(array(
            array('$match'=>array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week'))),'uniqueVisitorId'=>array('$ne'=>null))),
            array('$project'=>array('day'=>array('$dayOfMonth'=>'$recordedTime'),'month'=>array('$month'=>'$recordedTime'),'year'=>array('$year'=>'$recordedTime'))),
            array('$group'=>array('_id'=>array('day'=>'$day','month'=>'$month','year'=>'$year'), 'c'=>array('$sum'=>1)))
    )); 
    var_dump($res['result']);

To answer the question entirely:

    $m=new MongoClient();
    $db=$m->super_test;
    $db->gjgjgjg->insert(array(
        "ipAddress" => "127.0.0.1",
        "pageId" => new MongoId("522f80f59b7e7a0f2b000000"),
        "uniqueVisitorId" => "0445905a-4015-4b70-a8ef-b339ab7836f1",
        "recordedTime" => new MongoDate(strtotime("2013-09-16T20:20:19.0Z"))
    ));
    var_dump($db->gjgjgjg->find(array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week')))))->count()); // Prints 1
    $res=$db->gjgjgjg->aggregate(array(
            array('$match'=>array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week'))),'uniqueVisitorId'=>array('$ne'=>null))),
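            // Keep uniqueVisitorId in the projection; otherwise the next $group cannot see it
            array('$project'=>array('day'=>array('$dayOfMonth'=>'$recordedTime'),'month'=>array('$month'=>'$recordedTime'),'year'=>array('$year'=>'$recordedTime'),'uniqueVisitorId'=>1)),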
            array('$group'=>array('_id'=>array('day'=>'$day','month'=>'$month','year'=>'$year','v'=>'$uniqueVisitorId'), 'c'=>array('$sum'=>1))),
            array('$group'=>array('_id'=>array('day'=>'$_id.day','month'=>'$_id.month','year'=>'$_id.year'),'c'=>array('$sum'=>1)))
    )); 
    var_dump($res['result']);

Something close to that is what you're looking for, I believe.

It will return a set of documents that have the date as the _id and, in c, the count of unique visitors for that day. The count is irrespective of which visitor ID it is; it only detects whether the ID is there.
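To see why two $group stages give a unique count, here is a minimal sketch of the same idea in plain PHP arrays, without MongoDB. The sample data and variable names are made up for illustration; the day is assumed to be already extracted from recordedTime.

```php
<?php
// Hypothetical sample visits: one entry per page hit.
$visits = array(
    array('day' => '2013-09-16', 'v' => 'a'),
    array('day' => '2013-09-16', 'v' => 'a'),
    array('day' => '2013-09-16', 'v' => 'b'),
    array('day' => '2013-09-17', 'v' => 'a'),
);

// Stage 1: group by (day, visitor), like the first $group.
// Each distinct (day, visitor) pair collapses into a single entry.
$perVisitor = array();
foreach ($visits as $visit) {
    $key = $visit['day'] . '|' . $visit['v'];
    if (!isset($perVisitor[$key])) {
        $perVisitor[$key] = array('day' => $visit['day'], 'c' => 0);
    }
    $perVisitor[$key]['c']++; // hits by this visitor on this day
}

// Stage 2: group by day only, like the second $group.
// Adding 1 per stage-1 entry counts distinct visitors, not hits.
$uniquePerDay = array();
foreach ($perVisitor as $group) {
    $day = $group['day'];
    if (!isset($uniquePerDay[$day])) {
        $uniquePerDay[$day] = 0;
    }
    $uniquePerDay[$day]++;
}

print_r($uniquePerDay);
```

With this sample data, 2013-09-16 ends up with 2 unique visitors (a and b) even though there were 3 hits, and 2013-09-17 ends up with 1.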

Since you want it per day, I reckon you can replace the separate date parts with a single $dayOfYear field.
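A rough sketch of what the $dayOfYear variant could look like. The helper name buildUniqueVisitorsPipeline and the field name doy are my own, not part of any driver; the function only builds the pipeline array. I've kept $year next to $dayOfYear on the assumption that the time frame might cross a new year, in which case the day number alone would merge days from different years.

```php
<?php
// Hypothetical helper: returns the pipeline array only, it does not query MongoDB.
// $start and $end should be whatever date type your driver expects (e.g. MongoDate).
function buildUniqueVisitorsPipeline($start, $end)
{
    return array(
        // Limit to the time frame and skip records without a visitor ID
        array('$match' => array(
            'recordedTime' => array('$gte' => $start, '$lte' => $end),
            'uniqueVisitorId' => array('$ne' => null),
        )),
        // Reduce the timestamp to year + day of year, and keep the visitor ID
        array('$project' => array(
            'year' => array('$year' => '$recordedTime'),
            'doy' => array('$dayOfYear' => '$recordedTime'),
            'uniqueVisitorId' => 1,
        )),
        // One document per (day, visitor)
        array('$group' => array(
            '_id' => array('year' => '$year', 'doy' => '$doy', 'v' => '$uniqueVisitorId'),
        )),
        // One document per day, counting the (day, visitor) documents
        array('$group' => array(
            '_id' => array('year' => '$_id.year', 'doy' => '$_id.doy'),
            'c' => array('$sum' => 1),
        )),
    );
}
```

You would then pass the result to aggregate() in the same way as above, for example with new MongoDate(strtotime('-1 week')) and new MongoDate() as the two arguments.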
