
Couchbase data modelling - Document oriented

This question is not necessarily specific to the Couchbase 2.0 developer preview; however, I think it may help people investigating the new Couchbase product.

I am looking for advice on data modelling. We are investigating Couchbase with a view to possibly using it for Realtime Analytics.

However, I cannot find any documentation on how best to model real-world data.

I shall propose a scenario; it would be very useful if the community could help me or discuss some ideas for how it could be modelled.

Note that this is not representative of our product, and I am not asking people to solve our modelling for us; the question is intended more for discussion.

Let's assume that customers purchase products at a particular date/time. Products carry information such as id, name, description and price, and each purchase is made on a date.

The initial requirement is to be able to count all purchases between two dates. On any one day there might be over 100,000 purchases - this is a pretty big business ;)

If any of the syntax is incorrect please let me know - all advice/help is welcome.

If we modelled the data something like this (which may be completely incorrect):

Purchases with products

{
    "_id" : "purchase_1",
    "_rev" : "1-1212afdd126126128ae",
    "products" : {
        "prod_1" : {
            "name" : "Milk",
            "desc" : "Semi-skimmed 1ltr",
            "price" : "0.89"
        },
        "prod_7568" : {
            "name" : "Crisps",
            "desc" : "Salt and Vinegar",
            "price" : "0.85"
        }
    },
    "date" : "2012-01-14 14:24:33"
}

{
    "_id" : "purchase_2",
    "_rev" : "1-1212afdd126126128ae",
    "products" : {
        "prod_89001" : {
            "name" : "Bread",
            "desc" : "White thick sliced",
            "price" : "1.20"
        }
    },
    "date" : "2012-01-14 15:35:59"
}

So given that document layout, we can see each purchase and the products in it. However, how could we count all the purchases between two dates? Also, how could you see a log of all the purchases between two dates in descending date order?

Is this something Couchbase is suited for?

There might be hundreds of thousands of purchases between two dates, and the customer doesn't like to wait for reports... as I'm sure everyone has experienced ;)

Would it be best to use the incr functions, and if so, how would you go about modelling the data?

Many thanks to anyone who reads this - I hope to expand on this further, giving more examples of real-world modelling problems if possible.

James

In the simplest case you could write a Map function that would create a view using the date field as a key.

So with a slightly modified document design:

{
   "_id": "purchase_1",
   "_rev": "2-c09e24efaffd446c6ee8ed6a6e2b4a22",
   "products": [
       {
           "id": "prod_3",
           "name": "Bread",
           "desc": "Whole wheat high fiber",
           "price": 2.99
       }
   ],
   "date": "2012-01-15 12:34:56"
}

{
   "_id": "purchase_2",
   "_rev": "2-3a7f4e4e5907d2163d6684f97c45a715",
   "products": [
       {
           "id": "prod_1",
           "name": "Milk",
           "desc": "Semi-skimmed 1ltr",
           "price": 0.89
       },
       {
           "id": "prod_7568",
           "name": "Crisps",
           "desc": "Salt and Vinegar",
           "price": 0.85
       }
   ],
   "date": "2012-01-14 14:24:33"
}

Your map function would look like:

function(doc) {
  // Emit one row per product, keyed by the purchase date.
  if (doc.date && doc.products) {
    for (var i = 0; i < doc.products.length; i++) {
      emit(doc.date, doc.products[i].price);
    }
  }
}
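
To see which rows a map function like this produces without a running server, the view can be approximated locally. This is only a minimal sketch for plain Node.js: `runMap` is a hypothetical stand-in for the view engine (the real server supplies `emit` as a global and sorts keys with its own collation), and the two documents are trimmed copies of the ones above.

```javascript
// Hypothetical local stand-in for the view engine: runs a map function over
// documents, collects the emitted rows, and sorts them by key.
function runMap(docs, mapFn) {
  const rows = [];
  for (const doc of docs) {
    // The real server provides emit() as a global; here it is passed in.
    mapFn(doc, (key, value) => rows.push({ id: doc._id, key, value }));
  }
  // Plain string comparison is enough for "YYYY-MM-DD HH:MM:SS" keys.
  return rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Trimmed copies of the example documents.
const docs = [
  {
    _id: "purchase_1",
    products: [{ id: "prod_3", name: "Bread", price: 2.99 }],
    date: "2012-01-15 12:34:56",
  },
  {
    _id: "purchase_2",
    products: [
      { id: "prod_1", name: "Milk", price: 0.89 },
      { id: "prod_7568", name: "Crisps", price: 0.85 },
    ],
    date: "2012-01-14 14:24:33",
  },
];

// Same logic as the map function: one row per product, keyed by date.
const rows = runMap(docs, (doc, emit) => {
  for (const product of doc.products) {
    emit(doc.date, product.price);
  }
});

console.log(rows);
```

Each product becomes its own row, so a purchase with two products contributes two rows under the same date key.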

You could optionally add a reduce function that would sum up purchase prices by date.

function(keys, values) {
    // sum() is a helper provided by the view server.
    return sum(values);
}
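
The reduce above totals prices, whereas the original requirement was to count purchases between two dates. One possible variant, sketched here under assumptions, emits a single row per purchase with the value 1 and sums those. CouchDB-style views also offer a built-in `_count` reduce that should give the same result, though its availability in your Couchbase version is an assumption to verify. The names `mapPurchase`, `reduceCount` and `countBetween` are hypothetical, and `countBetween` only imitates a startkey/endkey query in memory.

```javascript
// Map: one row per purchase (not per product), keyed by date.
function mapPurchase(doc, emit) {
  if (doc.date) {
    emit(doc.date, 1);
  }
}

// Reduce: summing the 1s gives the number of purchases. The same code is
// safe on re-reduce, because partial counts are simply summed again.
function reduceCount(keys, values, rereduce) {
  return values.reduce((total, v) => total + v, 0);
}

// Hypothetical in-memory imitation of a startkey/endkey query (inclusive).
function countBetween(docs, startkey, endkey) {
  const values = [];
  for (const doc of docs) {
    mapPurchase(doc, (key, value) => {
      if (key >= startkey && key <= endkey) {
        values.push(value);
      }
    });
  }
  return reduceCount(null, values, false);
}

const purchases = [
  { _id: "purchase_1", date: "2012-01-15 12:34:56" },
  { _id: "purchase_2", date: "2012-01-14 14:24:33" },
];

console.log(countBetween(purchases, "2012-01-14", "2012-01-14 23:59:59")); // 1
console.log(countBetween(purchases, "2012-01-01", "2012-01-31 23:59:59")); // 2
```

Because the server keeps reduce results inside the index, a count over a date range should not need to touch every purchase document at query time.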

You could then query the view using the startkey and endkey parameters.

http://localhost:5984/couchbase/_design/Products/_view/total_price_by_date?startkey="2012-01-01"&endkey="2012-01-31"&group=true
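
When building such URLs in code, note that view keys are JSON values, so string keys keep their double quotes and should then be URL-encoded. The sketch below assumes the base URL from the example above; `viewUrl` is a hypothetical helper. It also shows one way to get the purchase log in descending date order: skip the reduce and swap the two keys, since with `descending=true` the view is read from the highest key downwards. The end key includes a time because a bare `"2012-01-31"` sorts before any timestamp on that day and would leave those purchases out.

```javascript
// Base view URL copied from the example above.
const VIEW_URL =
  "http://localhost:5984/couchbase/_design/Products/_view/total_price_by_date";

// Hypothetical helper: JSON-encode each parameter value, then URL-encode it.
function viewUrl(params) {
  const query = Object.entries(params)
    .map(([name, value]) => `${name}=${encodeURIComponent(JSON.stringify(value))}`)
    .join("&");
  return `${VIEW_URL}?${query}`;
}

// Totals per key for the whole of January 2012.
const totalsUrl = viewUrl({
  startkey: "2012-01-01",
  endkey: "2012-01-31 23:59:59",
  group: true,
});

// Purchase log, newest first: no reduce, keys swapped, descending order.
const logUrl = viewUrl({
  startkey: "2012-01-31 23:59:59",
  endkey: "2012-01-01",
  descending: true,
  reduce: false,
});

console.log(totalsUrl);
console.log(logUrl);
```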

The output from querying the view would be:

{"rows":[
{"key":"2012-01-14 14:24:33","value":4.94},
{"key":"2012-01-15 12:34:56","value":2.99}
]}

Or remove the group parameter to get the sum for the entire date range:

{"rows":[
{"key":null,"value":7.930000000000001}
]}

Hope that helps.

-- John
