
What's the best way to store/calculate user scores?

I am looking to design a database for a website where users can gain points (reputation) for performing certain activities, and I am struggling with the database design.

I am planning to keep records of the things a user does, so they may have 25 points for an item they have submitted, 1 point each for 30 comments they have made, and another 10 bonus points for being awesome!

Clearly all the data will be there, but it seems like a lot of querying to get the total score for each user, which I would like to display next to their username (in the form of a level). For example, a query to the submitted items table to get the scores for each item from that user, a query to the comments table, etc. If all this needs to be done for every user mentioned on a page... LOTS of queries!

I had considered keeping a score in the user table, which would seem a lot quicker to look up, but I've had it drummed into me that storing data that can be calculated from other data is BAD!

I've seen a lot of sites that do similar things (even Stack Overflow does something similar), so I figure there must be a "best practice" to follow. Can anyone suggest what it may be?

Any suggestions or comments would be great. Thanks!

I think this is definitely a great question. I've had to build systems that behave similarly, especially ones where the table holding the scores is read very often (as in your scenario). Here's my suggestion:

First, create some tables like the following (I'm following SQL Server best practices, but name them however you see fit):

UserAccount          UserAchievement
 -Guid (PK)           -Guid (PK)
 -FirstName           -UserAccountGuid (FK)
 -LastName            -Name
 -EmailAddress        -Score

Once you've done this, go ahead and create a view that looks something like the following (no, I haven't verified this SQL, but it should be a good start):

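SELECT [UserAccount].[Guid]           AS UserAccountGuid,
       [UserAccount].[FirstName]      AS FirstName,
       [UserAccount].[LastName]       AS LastName,
       SUM([UserAchievement].[Score]) AS TotalPoints
FROM [UserAccount]
INNER JOIN [UserAchievement]
     ON [UserAccount].[Guid] = [UserAchievement].[UserAccountGuid]
-- Group by the key as well, so two users who share a name are not merged.
GROUP BY [UserAccount].[Guid],
         [UserAccount].[FirstName],
         [UserAccount].[LastName]
-- No ORDER BY here: SQL Server rejects it in a view definition unless TOP,
-- OFFSET, or FOR XML is also specified, so sort in the query that reads the view.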

I know you've mentioned some concern about performance and a lot of queries, but if you build out a view like this, you won't ever need more than one query. I recommend not making this a materialized view; instead, just index your tables so that the lookup you need (essentially, on UserAccountGuid) enables fast summation across the table.
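As a rough, runnable sketch of the idea above, the following uses Python's built-in sqlite3 module as a stand-in for SQL Server. Several things here are assumptions made for illustration and are not part of the original suggestion: the integer `Id` keys (instead of GUIDs), the view name `UserScore`, the index name, the sample users, and the `LEFT JOIN` with `COALESCE`, which is used so that users with no achievements show up with 0 points.

```python
import sqlite3

# SQLite stands in for SQL Server; integer ids replace the GUID keys (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserAccount (
    Id INTEGER PRIMARY KEY,
    FirstName TEXT, LastName TEXT, EmailAddress TEXT
);
CREATE TABLE UserAchievement (
    Id INTEGER PRIMARY KEY,
    UserAccountId INTEGER NOT NULL REFERENCES UserAccount(Id),
    Name TEXT,
    Score INTEGER NOT NULL
);
-- Index the foreign key so per-user sums can be answered from the index.
CREATE INDEX IX_UserAchievement_User ON UserAchievement (UserAccountId, Score);

-- Hypothetical view name; groups by the account key rather than by name.
CREATE VIEW UserScore AS
SELECT a.Id AS UserAccountId, a.FirstName, a.LastName,
       COALESCE(SUM(h.Score), 0) AS TotalPoints
FROM UserAccount a
LEFT JOIN UserAchievement h ON h.UserAccountId = a.Id
GROUP BY a.Id, a.FirstName, a.LastName;
""")

conn.execute("INSERT INTO UserAccount VALUES (1, 'Ada', 'Example', 'ada@example.com')")
conn.execute("INSERT INTO UserAccount VALUES (2, 'Bob', 'Sample', 'bob@example.com')")

# The scenario from the question: one item (25), a bonus (10), 30 comments (1 each).
rows = [(1, "Submitted item", 25), (1, "Bonus", 10)] + [(1, "Comment", 1)] * 30
conn.executemany(
    "INSERT INTO UserAchievement (UserAccountId, Name, Score) VALUES (?, ?, ?)", rows
)

# A single query against the view returns every user's total.
totals = dict(conn.execute("SELECT UserAccountId, TotalPoints FROM UserScore"))
print(totals)
```

With this data, user 1 ends up with 25 + 10 + 30 = 65 points and user 2 with 0, all from one query.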

I will add one more point: if your UserAccount table gets huge, you may consider a slightly more intelligent query that takes the accounts you need roll-ups for as input. That way you avoid returning huge data sets to your web site when you're only showing, say, 3-10 users' information on the page. I'd have to think a bit more about how to do this elegantly, but I'd suggest staying away from "IN" statements since this will invoke a linear search of the table.
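One possible way to limit the roll-up to the users on a page without an "IN" list is to load the ids into a small temporary table and join against it. This is only a sketch under assumptions: it uses Python's sqlite3 as a stand-in for SQL Server, and the `PageUser` table, the `scores_for_page` helper, and the integer ids are hypothetical names introduced here.

```python
import sqlite3

def scores_for_page(conn, user_ids):
    """Return {user id: total points} for just the users shown on one page."""
    # Hypothetical temp table holding the ids needed for this page.
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS PageUser (Id INTEGER PRIMARY KEY)")
    conn.execute("DELETE FROM PageUser")
    conn.executemany(
        "INSERT OR IGNORE INTO PageUser (Id) VALUES (?)", [(u,) for u in user_ids]
    )
    # Join from the small id table into the indexed achievement table.
    rows = conn.execute("""
        SELECT p.Id, COALESCE(SUM(h.Score), 0)
        FROM PageUser p
        LEFT JOIN UserAchievement h ON h.UserAccountId = p.Id
        GROUP BY p.Id
    """)
    return dict(rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UserAchievement ("
    "UserAccountId INTEGER NOT NULL, Name TEXT, Score INTEGER NOT NULL)"
)
conn.execute("CREATE INDEX IX_UserAchievement_User ON UserAchievement (UserAccountId)")
conn.executemany(
    "INSERT INTO UserAchievement VALUES (?, ?, ?)",
    [(1, "Submitted item", 25), (1, "Bonus", 10), (2, "Comment", 1), (3, "Comment", 1)],
)

# Only users 1 and 3 appear on this page, so only their totals are computed.
page_scores = scores_for_page(conn, [1, 3])
print(page_scores)
```

Whether this actually beats a parameterized "IN" list depends on the database and its query planner, so it is worth measuring both before committing to either.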

For very high read/write ratios, denormalizing is a very valid option. You can use an indexed view, and the data will be kept in sync declaratively (so you never have to worry about bad score data). The downside is that it IS kept in sync, so the update to the stored total is a synchronous part of committing the scoring action. This would normally be quite fast, but it is a design decision. If you denormalize yourself, you can choose whether you want some kind of delayed update system.

Personally, I would start with an indexed view; later you can replace it fairly seamlessly with a concrete table if your needs dictate.

In the past we've always used some sort of nightly or periodic cron job to calculate the current score and save it in the database, sort of like a persistent view of the SUM on the activities table. Like most "best practices", this one is simply a guideline, and it's often better and more practical to deviate from a hard-nosed practice in very specific areas.

Plus, it's not really all that much of a deviation if you use the cron job, as the result is better viewed as a cache stored in the database.
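A minimal sketch of such a periodic job, assuming Python's sqlite3 as a stand-in database. The `UserScoreCache` table and the `refresh_score_cache` and `cached_total` helpers are hypothetical names for illustration; in practice the function would be called from whatever cron script you already run.

```python
import sqlite3

def refresh_score_cache(conn):
    """Recompute every user's total from the activity rows (run periodically)."""
    with conn:  # delete and refill inside one transaction, committed on success
        conn.execute("DELETE FROM UserScoreCache")
        conn.execute("""
            INSERT INTO UserScoreCache (UserAccountId, TotalPoints)
            SELECT UserAccountId, SUM(Score)
            FROM UserAchievement
            GROUP BY UserAccountId
        """)

def cached_total(conn, user_id):
    """Cheap primary-key lookup used when rendering a page."""
    row = conn.execute(
        "SELECT TotalPoints FROM UserScoreCache WHERE UserAccountId = ?", (user_id,)
    ).fetchone()
    return row[0] if row else 0

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserAchievement (
    UserAccountId INTEGER NOT NULL, Name TEXT, Score INTEGER NOT NULL
);
CREATE TABLE UserScoreCache (
    UserAccountId INTEGER PRIMARY KEY, TotalPoints INTEGER NOT NULL
);
""")
conn.executemany(
    "INSERT INTO UserAchievement VALUES (?, ?, ?)",
    [(1, "Submitted item", 25), (1, "Bonus", 10)],
)

refresh_score_cache(conn)
before = cached_total(conn, 1)

# A new comment is recorded, but the cache is not touched until the next run.
conn.execute("INSERT INTO UserAchievement VALUES (1, 'Comment', 1)")
stale = cached_total(conn, 1)

refresh_score_cache(conn)
fresh = cached_total(conn, 1)
print(before, stale, fresh)
```

The trade-off is visible in the `stale` read: between runs the displayed score can lag behind the activity rows, which is usually acceptable for something like a reputation level.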

If you have a separate scores table, you could update it each time a user submits an item or posts a comment. You could do this using a trigger or within the site's code.

The user scores would be updated continuously and could be queried quickly for display.
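The trigger variant could look roughly like this. It is a sketch using Python's sqlite3 and SQLite trigger syntax, which differs from SQL Server's; the table and trigger names are assumptions, and only inserts are handled (deleting or editing an activity row would need matching triggers).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserAchievement (
    UserAccountId INTEGER NOT NULL, Name TEXT, Score INTEGER NOT NULL
);
-- Hypothetical separate scores table, one row per user.
CREATE TABLE UserScore (
    UserAccountId INTEGER PRIMARY KEY,
    TotalPoints INTEGER NOT NULL DEFAULT 0
);

-- Keep the running total in step with every new activity row.
CREATE TRIGGER TR_UserAchievement_Insert
AFTER INSERT ON UserAchievement
BEGIN
    -- Create the user's row on first activity, then add the new points.
    INSERT OR IGNORE INTO UserScore (UserAccountId, TotalPoints)
    VALUES (NEW.UserAccountId, 0);
    UPDATE UserScore
    SET TotalPoints = TotalPoints + NEW.Score
    WHERE UserAccountId = NEW.UserAccountId;
END;
""")

conn.executemany(
    "INSERT INTO UserAchievement VALUES (?, ?, ?)",
    [(1, "Submitted item", 25), (1, "Comment", 1), (1, "Comment", 1), (1, "Bonus", 10)],
)

# Displaying the score is now a single primary-key lookup.
total = conn.execute(
    "SELECT TotalPoints FROM UserScore WHERE UserAccountId = 1"
).fetchone()[0]
print(total)
```

Doing the same update in the site's code instead of a trigger works too, as long as the activity insert and the score update share one transaction.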

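Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish this content, please credit this site's URL or the original source. For any questions, contact: yoyou2525@163.com.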

 