Use mapReduce on a 'logs' collection to generate HTTP stream
MongoDB newbie question:
I store a lot of HTTP logs in a collection with the following data structure:
{
    'client': {
        'ip_address': '1.2.3.4',
        'referrer': 'http://....',
        'user_agent': 'Mozilla...'
    },
    'request': {
        'stream': 'stream1',
        'method': 'GET',
        'fragment_id': 97,
        'date': 13482181
    },
    'response': {
        'status': 200,
        'size': 654
    }
}
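Each document describes one HTTP request (from a client to a content stream). Because each stream is split into smaller fragments, I would like to run mapReduce over the collection and build a "general stream request" document like this: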
{
    'client_ip': '1.2.3.4',
    'user_agent': 'Mozilla',
    'streams': [
        {
            'stream': 'stream1',
            'referrer': 'http://...',
            'requests': [
                {
                    'fragment_id': 97,
                    'status': 200,
                    'date': 13482181,
                    'size': 654
                    ...
                },
                {
                    'fragment_id': 98,
                    'status': 200,
                    'date': 13482192,
                    'size': 624
                    ...
                }, [...]
            ]
        }, [...]
    ]
}
This is what I tried:
map = function(){
    emit({client_ip:this.client.ip,user_agent:this.client.user_agent},{
        stream:this.request.stream,
        referrer:this.client.referer,
        status:this.response.status,
        date:this.request.date,
        size:this.response.total_size,
        fragment_id:this.request.fragment_infos[1]
    });
}
reduce = function(key,values){
    r = {'count':0,'request':[]};
    values.forEach(function(v){
        r.count += 1;
        r.request.push(v);
    });
    return r;
}
But this is the result I get:
"_id" : {
"client_ip" : "1.2.3.4",
"user_agent" : "Mozilla\/4.0"
},
"value" : {
"client_ip" : "1.2.3.4",
"user_agent" : "Mozilla\/4.0",
"count" : 17,
"request" : {
"0" : {
"client_ip" : "1.2.3.4",
"user_agent" : "Mozilla\/4.0",
"count" : 2,
"request" : {
"0" : {
"stream" : "stream1.isml",
"referrer" : null,
"status" : 200,
"date" : 1341706566,
"size" : 456,
"fragment_id" : null,
"count" : 1
},
"1" : {
"stream" : "stream1.isml",
"referrer" : null,
"status" : 200,
"date" : 1341706566,
"size" : null,
"fragment_id" : null,
"count" : 1
}
}
},
"1" : {
"client_ip" : "1.2.3.4",
"user_agent" : "Mozilla\/4.0",
"count" : 3,
"request" : {
"0" : {
"client_ip" : "1.2.3.4",
"user_agent" : "Mozilla\/4.0",
"count" : 2,
"request" : {
"0" : {
"stream" : "stream1.isml",
"referrer" : null,
"status" : 200,
"date" : 1341706568,
"size" : null,
"fragment_id" : null,
"count" : 1
.........
Where am I going wrong?
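You will always end up with records that contain _id and value; that is a property of MongoDB map/reduce. There is an open ticket to change this behavior: https://jira.mongodb.org/browse/SERVER-2517
As for getting the value to match your example, the output of your map function should have the same form as the output of your reduce function. MongoDB may call reduce more than once for the same key and feed the result of an earlier call back in as one of the values, which is why your output shows results nested inside results. A map function that emits the target form could look like this: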
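map = function(){
    // An object literal cannot use an expression such as this.request.stream
    // as a key, so build the streams hash with bracket notation instead.
    // Field paths are kept exactly as in the question's code.
    var streams = {};
    streams[this.request.stream] = {
        referrer: this.client.referer,
        requests: [
            {
                fragment_id: this.request.fragment_infos[1],
                status: this.response.status,
                date: this.request.date,
                size: this.response.total_size
            }
        ]
    };
    emit({client_ip: this.client.ip, user_agent: this.client.user_agent}, {
        client_ip: this.client.ip,
        user_agent: this.client.user_agent,
        streams: streams
    });
}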
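To see why the two shapes have to match, here is a small plain-JavaScript sketch that runs the reduce function from the question twice on made-up sample values, the way MongoDB may do when it reduces a key in batches. It runs under Node.js without a database; the sample values and variable names are assumptions for illustration only.

```javascript
// The reduce body from the question, with `var` added so that `r`
// is not an implicit global.
function originalReduce(key, values) {
    var r = { count: 0, request: [] };
    values.forEach(function (v) {
        r.count += 1;
        r.request.push(v);
    });
    return r;
}

var key = { client_ip: '1.2.3.4', user_agent: 'Mozilla/4.0' };

// Three values shaped like the question's map output (hypothetical data).
var a = { stream: 'stream1', fragment_id: 97 };
var b = { stream: 'stream1', fragment_id: 98 };
var c = { stream: 'stream1', fragment_id: 99 };

// First pass over a partial batch of emitted values...
var partial = originalReduce(key, [a, b]);
// ...then a second pass that receives the earlier result plus a new value.
var rereduced = originalReduce(key, [partial, c]);

// The earlier result ends up nested inside `request` instead of merged,
// and `count` counts inputs to the last call, not original requests.
console.log(JSON.stringify(rereduced, null, 2));
```

The nested "request" objects in the question's output have exactly this pattern: each level is the result of an earlier reduce call that was pushed into the array as if it were a single request.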
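You will need to modify the reduce function so that it merges multiple documents of this form. If necessary, write a finalize function that converts the hash of streams into an array of streams and adds the stream name inside each element.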
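One possible shape for those two functions is sketched below. This is not code from the original answer: it assumes every value has the form { client_ip, user_agent, streams: { name: { referrer, requests: [...] } } }, keeps the first referrer seen for each stream, and uses hypothetical sample data to exercise the functions under Node.js.

```javascript
// Merge any number of values of the map-output form into one value of
// the same form, so the result can safely be reduced again.
function reduce(key, values) {
    var merged = {
        client_ip: key.client_ip,
        user_agent: key.user_agent,
        streams: {}
    };
    values.forEach(function (value) {
        for (var name in value.streams) {
            if (!Object.prototype.hasOwnProperty.call(value.streams, name)) {
                continue;
            }
            var incoming = value.streams[name];
            if (!merged.streams[name]) {
                merged.streams[name] = { referrer: incoming.referrer, requests: [] };
            }
            // Concatenate rather than push, so that an earlier reduce result
            // contributes all of its requests as a flat list.
            merged.streams[name].requests =
                merged.streams[name].requests.concat(incoming.requests);
        }
    });
    return merged;
}

// Turn the hash of streams into an array and move each name into its element.
function finalize(key, reduced) {
    var streams = [];
    for (var name in reduced.streams) {
        if (Object.prototype.hasOwnProperty.call(reduced.streams, name)) {
            streams.push({
                stream: name,
                referrer: reduced.streams[name].referrer,
                requests: reduced.streams[name].requests
            });
        }
    }
    return { client_ip: reduced.client_ip, user_agent: reduced.user_agent, streams: streams };
}

// Hypothetical sample values, as the map function would emit them.
function sample(stream, fragmentId) {
    var streams = {};
    streams[stream] = {
        referrer: null,
        requests: [{ fragment_id: fragmentId, status: 200, date: 13482181, size: 654 }]
    };
    return { client_ip: '1.2.3.4', user_agent: 'Mozilla', streams: streams };
}

var key = { client_ip: '1.2.3.4', user_agent: 'Mozilla' };
var v1 = sample('stream1', 97);
var v2 = sample('stream1', 98);
var v3 = sample('stream2', 1);

// Reduce in two passes, as MongoDB may do, then finalize.
var result = finalize(key, reduce(key, [reduce(key, [v1, v2]), v3]));
console.log(JSON.stringify(result, null, 2));
```

In the mongo shell, the functions would be passed along the lines of db.logs.mapReduce(map, reduce, { out: 'streams_by_client', finalize: finalize }), where 'logs' is taken from the question title and 'streams_by_client' is a made-up output collection name. Because MongoDB does not call reduce for a key that has only one emitted value, finalize is written to accept the map output directly as well.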