
Convert MySQL query to MongoDB

I have started learning MongoDB and am stuck on a problem. I have a collection named server_logs.

It contains the following fields: SOURCE_SERVER, SOURCE_PORT, DESTINATION_PORT, DESTINATION_SERVER, MBYTES.

For each SOURCE_SERVER I need the total MBYTES transferred. There is one more requirement: if a source server also appears as a destination server in other rows, the MBYTES of those rows must be added to its total as well.

For example, given the table structure below:

  SOURCE   S_PORT   DEST    D_PORT  MBYTES
1)server1   446    server2   555     10MB
2)server3   226    server1   666     2MB
3)server1   446    server3   226     5MB

I need the result below:

Server1  17MB
Server3  7MB

I have written a query in MySQL that finds the top SOURCE by the MBYTES of data transferred to that SOURCE. It works fine and returns the required results in MySQL:

SELECT SOURCE, DEST, sum( logs.MBYTES )+(
    SELECT SUM(log.MBYTES) as sum
    from logs as log
    where logs.DEST=log.SOURCE
) AS MBYTES

I want the equivalent query in MongoDB. Please help.

Thanks in advance.

Although it may not be immediately apparent how to express this sort of "self join" query in MongoDB, it can be done with the aggregation framework. It just requires a little change in your thinking.

With your data in MongoDB in this form, which is still very much like the original SQL source:

{ 
    "source" : "server1",
    "s_port" : 446,
    "dest" : "server2", 
    "d_port" : 555, 
    "transferMB" : 10
},
{ 
    "source" : "server3",
    "s_port" : 226,
    "dest" : "server1",
    "d_port" : 666,
    "transferMB" : 2
},
{ 
    "source" : "server1",
    "s_port" : 446, 
    "dest" : "server3",
    "d_port" : 226,
    "transferMB" : 5
}

With a pre-2.6 version of MongoDB, your query will look like this:

db.logs.aggregate([

    // Project a "type" tag in order to transform, then unwind
    { "$project": {
         "source": 1,
         "dest": 1,
         "transferMB": 1,
         "type": { "$cond": [ 1,[ "source", "dest" ],0] }
    }},
    { "$unwind": "$type" },

    // Map the "source" and "dest" servers onto the type, keep the source       
    { "$project": {
        "type": 1,
        "tag": { "$cond": [
            { "$eq": [ "$type", "source" ] },
            "$source",
            "$dest"
        ]},
        "mbytes": "$transferMB",
        "source": 1
    }},

    // Group for totals, keep an array of the "source" for each
    { "$group": {
        "_id": "$tag",
        "mbytes": { "$sum": "$mbytes" },
        "source": { "$addToSet": "$source" }
    }},


    // Unwind that array
    { "$unwind": "$source" },

    // Is our grouped tag one of the sources? Simulates an inner join
    { "$project": {
        "mbytes": 1,
        "matched": { "$eq": [ "$source", "$_id" ] }
    }},

    // Filter the results that did not match
    { "$match": { "matched": true }},


    // Discard duplicates for each server tag
    { "$group": { 
        "_id": "$_id",
        "mbytes": { "$first": "$mbytes" }
    }}
])

For versions 2.6 and above, you get a few additional operators to streamline this, or at least to make use of different operators:

db.logs.aggregate([

    // Project a "type" tag in order to transform, then unwind
    { "$project": {
         "source": 1,
         "dest": 1,
         "transferMB": 1,
         "type": { "$literal": [ "source", "dest" ] }
    }},
    { "$unwind": "$type" },

    // Map the "source" and "dest" servers onto the type, keep the source       
    { "$project": {
        "type": 1,
        "tag": { "$cond": [
            { "$eq": [ "$type", "source" ] },
            "$source",
            "$dest"
        ]},
        "mbytes": "$transferMB",
        "source": 1
    }},

    // Group for totals, keep an array of the "source" for each
    { "$group": {
        "_id": "$tag",
        "mbytes": { "$sum": "$mbytes" },
        "source": { "$addToSet": "$source" }
    }},

    // Coerce the server tag into an array ( of one element )
    { "$group": {
        "_id": "$_id",
        "mbytes": { "$first": "$mbytes" },
        "source": { "$first": "$source" },
        "tags": { "$push": "$_id" }
    }},

    // Use set intersection to find common element count of arrays
    { "$project": {
       "mbytes": 1,
       "matched": { "$size": { 
           "$setIntersection": [
               "$source",
               "$tags"
           ]
       }}
    }},

    // Filter those that had nothing in common
    { "$match": { "matched": { "$gt": 0 } }},

    // Remove the un-required field
    { "$project": { "mbytes": 1 }}
])

Both forms produce the same results:

{ "_id" : "server1", "mbytes" : 17 }
{ "_id" : "server3", "mbytes" : 7 }

The general principle in both is that by keeping a list of the valid "source" servers, you can then "filter" the combined results so that only the servers that were listed as a source have their total transfer recorded.
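To make that principle concrete outside of MongoDB, here is a minimal plain JavaScript sketch over the three sample documents. The `logs` array and the `totalsForSources` helper are illustrative names only, not part of any MongoDB API, and the sketch assumes the `source`, `dest` and `transferMB` fields shown earlier.

```javascript
// Sample rows from the question (ports omitted, they do not affect the totals).
const logs = [
  { source: "server1", dest: "server2", transferMB: 10 },
  { source: "server3", dest: "server1", transferMB: 2 },
  { source: "server1", dest: "server3", transferMB: 5 },
];

// Hypothetical helper: total traffic per server, restricted to servers
// that appear as a source at least once.
function totalsForSources(rows) {
  // The list of valid "source" servers.
  const sources = new Set(rows.map((r) => r.source));

  // Count each transfer once for its source and once for its destination.
  const totals = new Map();
  for (const { source, dest, transferMB } of rows) {
    for (const server of [source, dest]) {
      totals.set(server, (totals.get(server) || 0) + transferMB);
    }
  }

  // Keep only the servers that were listed as a source.
  return [...totals]
    .filter(([server]) => sources.has(server))
    .map(([server, mbytes]) => ({ _id: server, mbytes }))
    .sort((a, b) => a._id.localeCompare(b._id));
}

console.log(totalsForSources(logs));
```

For the sample rows this yields 17 for server1 and 7 for server3, while server2 is dropped because it never appears as a source.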

So there are a couple of techniques you can use to "re-shape", "combine" and "filter" your documents to get your desired result.

Read up more on the aggregation operators. For an introduction, the SQL to Aggregation Mapping Chart in the documentation is also worth a look, as it gives you some idea of how to convert common operations.

You can even browse the tags here on Stack Overflow to find some interesting transformation operations.

You can use the aggregation framework for this:

db.logs.aggregate([
    {$group:{_id:"$SOURCE",MBYTES:{$sum:"$MBYTES"}}}
])

This assumes that the MBYTES field holds only numeric values. Grouping on SOURCE alone totals only the transfers each server sent, so for the sample data the result is:

{
    _id: "server1",
    MBYTES: 15
},
{
    _id: "server3",
    MBYTES: 2
}
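As a quick check of what grouping on the source field alone computes, the following plain JavaScript sketch adds up the question's three sample rows by SOURCE. It does not touch MongoDB; `rows` and `sumBySource` are illustrative names, and MBYTES is assumed to be numeric.

```javascript
// Sample rows from the question, with numeric MBYTES.
const rows = [
  { SOURCE: "server1", DEST: "server2", MBYTES: 10 },
  { SOURCE: "server3", DEST: "server1", MBYTES: 2 },
  { SOURCE: "server1", DEST: "server3", MBYTES: 5 },
];

// Hypothetical helper mirroring a $group on SOURCE with a $sum of MBYTES.
function sumBySource(input) {
  const totals = {};
  for (const row of input) {
    totals[row.SOURCE] = (totals[row.SOURCE] || 0) + row.MBYTES;
  }
  return totals;
}

console.log(sumBySource(rows));
```

Only the rows a server sent are counted here: server1 gets 10 + 5 and server3 gets 2. The 2 MB that server1 received and the 5 MB that server3 received are not included.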

If you also have to count the rows where the server appears in the DEST field, you can use the map-reduce method:

var mapF = function(){
    // Emit the transfer once for the sending and once for the receiving server
    emit(this.SOURCE,this.MBYTES);
    emit(this.DEST,this.MBYTES);
}

var reduceF = function(serverId,mbytesValues){
    // Return a plain number, the same type that map emits: MongoDB may
    // pass earlier reduce output back into reduce for the same key
    return Array.sum(mbytesValues);
}

db.logs.mapReduce(mapF,reduceF,{out:"server_stats"});

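After that, you can find the results in the server_stats collection. Note that this totals every server that appears in either field, so a server that occurs only as a destination (server2 in the sample data) is included too.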
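One detail of map-reduce is worth illustrating: MongoDB may call the reduce function more than once for the same key and feed it the output of an earlier call, so reduce has to return the same type that map emits. The sketch below is a small in-memory simulation in plain JavaScript, not MongoDB's implementation; `simulateMapReduce`, `mapTransfers`, `sumReduce` and `objectReduce` are hypothetical names, and `emit` is passed in as a parameter here rather than being a global as it is inside MongoDB.

```javascript
// Sample rows from the question, with numeric MBYTES.
const sampleDocs = [
  { SOURCE: "server1", DEST: "server2", MBYTES: 10 },
  { SOURCE: "server3", DEST: "server1", MBYTES: 2 },
  { SOURCE: "server1", DEST: "server3", MBYTES: 5 },
];

// Hypothetical in-memory stand-in for mapReduce, for illustration only.
// It reduces values pairwise, so reduce is fed its own earlier output.
function simulateMapReduce(docs, mapFn, reduceFn) {
  const emitted = new Map();
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  docs.forEach((doc) => mapFn.call(doc, emit));

  const result = {};
  for (const [key, values] of emitted) {
    // A key with a single value never reaches reduceFn.
    result[key] = values.reduce((acc, v) => reduceFn(key, [acc, v]));
  }
  return result;
}

// Map step: emit each transfer for both the sending and the receiving server.
function mapTransfers(emit) {
  emit(this.SOURCE, this.MBYTES);
  emit(this.DEST, this.MBYTES);
}

// Reduce that returns a number, the same type that map emits.
const sumReduce = (key, values) => values.reduce((a, b) => a + b, 0);

// Reduce that returns an object, a different type than map emits.
const objectReduce = (key, values) => {
  const reduced = { server: key, mbytes: 0 };
  values.forEach((value) => {
    reduced.mbytes += value;
  });
  return reduced;
};

const good = simulateMapReduce(sampleDocs, mapTransfers, sumReduce);
const bad = simulateMapReduce(sampleDocs, mapTransfers, objectReduce);

console.log(good);
console.log(typeof bad.server1.mbytes);
```

With the numeric reduce, the totals are 17 for server1, 10 for server2 and 7 for server3. With the object-returning reduce, the second call for server1 adds an object to a number, so its total is no longer a number, and server2 (a single emitted value) stays a bare number while the other keys become objects.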
