
Analysis system - database design or schema for storing statistics

I have several models, each of which generates stats about itself:

Model        Example stats
-----        -------------
User         qty_logins... qty_toys... qty_friends
Group        qty_users... qty_invites...
Section      qty_visits

All of the stats data goes into a single MySQL table, stats, with this structure:

model_id    kind       name           value
-----       ----       ----           -----
123         user       qty_logins     5
123         user       qty_toys       14
456         group      qty_invites    21
789         section    qty_visits     23

Currently I have 100k rows and performance is OK.

Question 1) Is this a good way to store stats data? Or should I split it into separate tables (one for each model kind, for example)?

Question 2) I'm trying to implement dynamically generated results, for example qty_logins + qty_visits. The problem is keeping such a result up to date automatically every time one of the underlying values changes. Is there a database with dynamic data generation, or any other tool, that can help do this in real time?

Your schema is fine, assuming that the values are all numeric (which is reasonable for statistical values).

This structure is called an entity-attribute-value (EAV) model. Such a model stores each value in a separate row. In general, EAV is not the best way to store data. However, in this case, you have a flexible number of statistics on a variety of models, and both might change over time. So, it seems like a reasonable application.

You can probably improve the performance of your queries with appropriate indexing. Without seeing the queries, though, any specific recommendation would be speculative.
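As a rough illustration of the indexing point, here is a minimal sketch rather than a recommendation for your actual workload. It assumes the most common query is "all stats for one entity", and it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs on its own. The composite key (kind, model_id, name) and the helper stats_for are assumptions of mine, not something taken from the question.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE stats (
        model_id INTEGER NOT NULL,
        kind     TEXT    NOT NULL,
        name     TEXT    NOT NULL,
        value    INTEGER NOT NULL,
        -- Assumption: at most one row per (kind, model_id, name).
        -- The key doubles as the index for per-entity lookups.
        PRIMARY KEY (kind, model_id, name)
    )
    """
)
conn.executemany(
    "INSERT INTO stats (model_id, kind, name, value) VALUES (?, ?, ?, ?)",
    [
        (123, "user", "qty_logins", 5),
        (123, "user", "qty_toys", 14),
        (456, "group", "qty_invites", 21),
        (789, "section", "qty_visits", 23),
    ],
)

LOOKUP = "SELECT name, value FROM stats WHERE kind = ? AND model_id = ?"


def stats_for(kind, model_id):
    """Return all stats of one entity as a {name: value} dict."""
    return dict(conn.execute(LOOKUP, (kind, model_id)).fetchall())


# Ask the planner how it would run the lookup; the last column of each
# plan row is a human-readable description of the step.
plan = conn.execute("EXPLAIN QUERY PLAN " + LOOKUP, ("user", 123)).fetchall()
plan_detail = " ".join(row[3] for row in plan)

print(stats_for("user", 123))
print(plan_detail)
```

In MySQL the analogous step would be a composite primary key or index on the same columns, checked with EXPLAIN. Whether this column order is the right one depends on the queries you actually run.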

Question (2) is rather difficult. It is not too hard for your example, but it gets complicated if you want to support hierarchical expressions (that is, expressions based on other expressions). For your example, you have three basic options:

  • You can use triggers to update values. You would need additional columns or another table specifying the relationships.
  • You can use views to retrieve the values, doing the calculation when you fetch the results.
  • You can use stored procedures for all changes to the data, and put the logic in the stored procedure.
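To make the trigger option from the list above more concrete, here is a minimal sketch under several assumptions. It uses Python's built-in sqlite3 module in place of MySQL so that it runs as-is, and trigger syntax differs somewhat between the two. The tables derived_parts (the "another table specifying the relationships") and derived_stats, the derived stat name qty_activity, and the helper derived are all hypothetical. The results go into a separate table because, as far as I know, a MySQL trigger cannot modify the table it is defined on.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stats (
        model_id INTEGER NOT NULL,
        kind     TEXT    NOT NULL,
        name     TEXT    NOT NULL,
        value    INTEGER NOT NULL,
        PRIMARY KEY (kind, model_id, name)
    );

    -- Hypothetical relationship table: which stats feed each derived stat.
    CREATE TABLE derived_parts (
        derived_name TEXT NOT NULL,
        source_name  TEXT NOT NULL,
        PRIMARY KEY (derived_name, source_name)
    );

    -- Hypothetical table holding the trigger-maintained results.
    CREATE TABLE derived_stats (
        model_id INTEGER NOT NULL,
        kind     TEXT    NOT NULL,
        name     TEXT    NOT NULL,
        value    INTEGER NOT NULL,
        PRIMARY KEY (kind, model_id, name)
    );

    -- Recompute every derived stat that depends on the row just written.
    CREATE TRIGGER stats_after_insert AFTER INSERT ON stats
    BEGIN
        INSERT OR REPLACE INTO derived_stats (model_id, kind, name, value)
        SELECT NEW.model_id, NEW.kind, p.derived_name, SUM(s.value)
        FROM derived_parts AS p
        JOIN stats AS s
          ON s.name = p.source_name
         AND s.kind = NEW.kind
         AND s.model_id = NEW.model_id
        WHERE p.derived_name IN (
            SELECT derived_name FROM derived_parts
            WHERE source_name = NEW.name
        )
        GROUP BY p.derived_name;
    END;

    CREATE TRIGGER stats_after_update AFTER UPDATE OF value ON stats
    BEGIN
        INSERT OR REPLACE INTO derived_stats (model_id, kind, name, value)
        SELECT NEW.model_id, NEW.kind, p.derived_name, SUM(s.value)
        FROM derived_parts AS p
        JOIN stats AS s
          ON s.name = p.source_name
         AND s.kind = NEW.kind
         AND s.model_id = NEW.model_id
        WHERE p.derived_name IN (
            SELECT derived_name FROM derived_parts
            WHERE source_name = NEW.name
        )
        GROUP BY p.derived_name;
    END;

    -- Hypothetical derived stat: qty_activity = qty_logins + qty_toys.
    INSERT INTO derived_parts VALUES
        ('qty_activity', 'qty_logins'),
        ('qty_activity', 'qty_toys');
    """
)


def derived(kind, model_id, name):
    """Return the maintained value of a derived stat, or None if absent."""
    row = conn.execute(
        "SELECT value FROM derived_stats"
        " WHERE kind = ? AND model_id = ? AND name = ?",
        (kind, model_id, name),
    ).fetchone()
    return row[0] if row else None


conn.executemany(
    "INSERT INTO stats (model_id, kind, name, value) VALUES (?, ?, ?, ?)",
    [
        (123, "user", "qty_logins", 5),
        (123, "user", "qty_toys", 14),
        (456, "group", "qty_invites", 21),
    ],
)
before = derived("user", 123, "qty_activity")

# One more login: the update trigger refreshes the derived row.
conn.execute(
    "UPDATE stats SET value = value + 1"
    " WHERE kind = 'user' AND model_id = 123 AND name = 'qty_logins'"
)
after = derived("user", 123, "qty_activity")
print(before, after)
```

The sketch ignores deletes and handles only plain sums. Expressions built on other expressions would need the triggers to cascade, which is where this approach gets complicated.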

The second option would be my first approach. The third would then be my preference.
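A minimal sketch of the view option, under a few assumptions: Python's built-in sqlite3 module stands in for MySQL so the example is runnable on its own, the view name user_activity and the column qty_activity are hypothetical, and the expression is qty_logins + qty_toys rather than qty_logins + qty_visits, because in the sample rows only the former pair belongs to the same entity.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stats (
        model_id INTEGER NOT NULL,
        kind     TEXT    NOT NULL,
        name     TEXT    NOT NULL,
        value    INTEGER NOT NULL,
        PRIMARY KEY (kind, model_id, name)
    );

    INSERT INTO stats (model_id, kind, name, value) VALUES
        (123, 'user',    'qty_logins',  5),
        (123, 'user',    'qty_toys',    14),
        (456, 'group',   'qty_invites', 21),
        (789, 'section', 'qty_visits',  23);

    -- Hypothetical derived stat, computed only when the view is read.
    CREATE VIEW user_activity AS
    SELECT model_id, SUM(value) AS qty_activity
    FROM stats
    WHERE kind = 'user'
      AND name IN ('qty_logins', 'qty_toys')
    GROUP BY model_id;
    """
)


def activity(model_id):
    """Return qty_logins + qty_toys for one user, or None if no rows exist."""
    row = conn.execute(
        "SELECT qty_activity FROM user_activity WHERE model_id = ?",
        (model_id,),
    ).fetchone()
    return row[0] if row else None


before = activity(123)

# Nothing has to be recomputed on write: the next read sees the change.
conn.execute(
    "UPDATE stats SET value = value + 1"
    " WHERE kind = 'user' AND model_id = 123 AND name = 'qty_toys'"
)
after = activity(123)
print(before, after)
```

The design trade-off is that every read pays for the calculation, but there is no stored result that can drift out of date, which is why this is the easiest option to start with.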


 