
MySQL: Is it reasonable to use a view, or would I be better off denormalizing my DB?

There is a 'team_sector' table with the following fields: Id, team_id, sect_id, size, level

It contains a few records for each 'team' entity (referenced by the 'team_id' field). Each record represents one sector of the team's stadium (8 sectors in total).

Now I need to implement a few searches:

  • by overall stadium size (SUM(size));
  • by the best quality (SUM(level)/COUNT(*)).

I could write a query something like this:

SELECT TS.team_id,
       SUM(TS.size) AS OverallSize,
       SUM(TS.level) / COUNT(TS.Id) AS QualityLevel
FROM team_sector AS TS   -- the TS alias has to be declared here
GROUP BY TS.team_id
ORDER BY OverallSize DESC;   -- or: ORDER BY QualityLevel DESC

My concern is that the calculation for every team will be repeated each time the query runs. The overhead is not too big (at least for now), but I would like to avoid performance issues later.

I see 2 options here.

The 1st one is to create 2 additional fields in the 'team' table (for example) and store OverallSize and QualityLevel there. If the information in the 'sector' table changes, those fields are updated too (triggers would probably be a good fit, as the sector table doesn't change very often).
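For illustration, option 1 might look roughly like the untested sketch below. It assumes the 'team' table has a primary key named id; the column names overall_size and quality_level and the trigger names are made up for this example. MySQL 5.1 allows only one trigger per table for each timing and event, so insert, update, and delete each need their own trigger.

```sql
-- Hypothetical denormalized columns on the team table.
ALTER TABLE team
  ADD COLUMN overall_size  INT           NOT NULL DEFAULT 0,
  ADD COLUMN quality_level DECIMAL(10,4) NULL;

-- Recalculate the affected team after a sector row is inserted.
CREATE TRIGGER team_sector_after_insert AFTER INSERT ON team_sector
FOR EACH ROW
  UPDATE team
  SET overall_size  = (SELECT COALESCE(SUM(ts.size), 0) FROM team_sector ts WHERE ts.team_id = team.id),
      quality_level = (SELECT AVG(ts.level) FROM team_sector ts WHERE ts.team_id = team.id)
  WHERE id = NEW.team_id;

-- An update may move a sector to another team, so both teams are recalculated.
CREATE TRIGGER team_sector_after_update AFTER UPDATE ON team_sector
FOR EACH ROW
  UPDATE team
  SET overall_size  = (SELECT COALESCE(SUM(ts.size), 0) FROM team_sector ts WHERE ts.team_id = team.id),
      quality_level = (SELECT AVG(ts.level) FROM team_sector ts WHERE ts.team_id = team.id)
  WHERE id IN (OLD.team_id, NEW.team_id);

-- After a delete only OLD is available.
CREATE TRIGGER team_sector_after_delete AFTER DELETE ON team_sector
FOR EACH ROW
  UPDATE team
  SET overall_size  = (SELECT COALESCE(SUM(ts.size), 0) FROM team_sector ts WHERE ts.team_id = team.id),
      quality_level = (SELECT AVG(ts.level) FROM team_sector ts WHERE ts.team_id = team.id)
  WHERE id = OLD.team_id;
```

Each trigger body is a single statement, so no DELIMITER change is needed in the mysql client. The triggers only read team_sector and write to team, which stays within MySQL's rule that a trigger may not modify the table it fires on.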

The 2nd option is to create a view that provides the required data.

The 2nd option seems much easier to me, but I don't have much experience working with views.

Q1: Which option is best from your perspective, and why? Could you perhaps suggest other options?

Q2: Can I create a view in such a way that it does the calculations only rarely (at least once per day)? If yes, how?

Q3: Is it reasonable to use triggers for this purpose (1st option)?

P.S. MySQL 5.1 is used; there are around 1-2 thousand teams and 6-8 thousand records in the sector table overall. I understand those numbers are pretty small, but I would like to follow best practice here.

I wouldn't add calculated fields to your source tables. Keep your source data separate from calculated data by using temporary tables instead. You can use a one-to-one mapping identified by a shared PK to increase performance by reducing indexes and such (so the PK of the source rows equals the PK of the rows in the calculated table).

The upside is that when you rebuild the DB, the absence of these tables makes it obvious that the calculated data is missing or stale. It also allows shortcuts such as clearing all calculated data by simply dropping the temp tables, for instance from a cron job. The calculated rows can also carry a timestamp of when they were calculated. Then, once the maximum cache period has expired, the data can be recalculated on the fly as it is loaded, or in a nightly batch when the servers are quiet.
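A minimal sketch of such a calculated table, assuming the hypothetical name team_stadium_stats and an INT team_id that doubles as its primary key (shared with the 'team' table). The calculated_at column records when each row was computed, so stale rows can be found and refreshed.

```sql
-- Calculated data lives in its own table, keyed by the same id as the team.
CREATE TABLE team_stadium_stats (
  team_id       INT           NOT NULL PRIMARY KEY,
  overall_size  INT           NULL,
  quality_level DECIMAL(10,4) NULL,
  calculated_at DATETIME      NOT NULL
);

-- Full rebuild, for example from a nightly cron job.
TRUNCATE TABLE team_stadium_stats;
INSERT INTO team_stadium_stats (team_id, overall_size, quality_level, calculated_at)
SELECT team_id, SUM(size), AVG(level), NOW()
FROM team_sector
GROUP BY team_id;

-- Rows older than the maximum cache period (one day here) are due for recalculation.
SELECT team_id
FROM team_stadium_stats
WHERE calculated_at < NOW() - INTERVAL 1 DAY;
```

Dropping or truncating this table never touches source data, which is the point of keeping the two apart.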

A few thousand (or even a few tens of thousands of) records are nothing you should be worried about.

Best practices are:

  • store data in a normalized fashion and let the database engine handle calculations
  • index your data properly, and do index maintenance now and then
  • avoid storing aggregated values with "parent" records
  • do some result caching in the application layer to avoid hitting the DB server more often than necessary
  • deal with performance issues when you get them
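As an example of the indexing advice: the aggregate query groups by team_id, so an index starting with that column is the obvious candidate. The index name below is made up, and whether the wider covering variant pays off is something to check with EXPLAIN on real data rather than assume.

```sql
-- Index that starts with the grouping column; size and level are included
-- so the aggregate query can be answered from the index alone.
CREATE INDEX idx_team_sector_team ON team_sector (team_id, size, level);

-- Check how MySQL plans to execute the query.
EXPLAIN
SELECT team_id, SUM(size) AS OverallSize, AVG(level) AS QualityLevel
FROM team_sector
GROUP BY team_id;

-- Occasional maintenance: refresh the index statistics.
ANALYZE TABLE team_sector;
```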

Yes, the database will calculate the SUM() whenever the view/query is executed, but I would expect results to be practically instant for the scenario you describe.
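A view for this case could be as simple as the sketch below (the view name team_stadium_view is hypothetical). A MySQL view is a stored query, not a stored result, so the sums are recomputed on every SELECT against it; MySQL has no built-in materialized views.

```sql
CREATE VIEW team_stadium_view AS
SELECT team_id,
       SUM(size)  AS OverallSize,
       AVG(level) AS QualityLevel   -- equals SUM(level)/COUNT(*) when level is never NULL
FROM team_sector
GROUP BY team_id;

-- The two searches from the question:
SELECT * FROM team_stadium_view ORDER BY OverallSize DESC;
SELECT * FROM team_stadium_view ORDER BY QualityLevel DESC;
```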

If you encounter a really complicated view that takes a long time to calculate and you cannot find any way to optimize your tables further, you can introduce a helper table that is filled with the view results regularly (or via triggers) and query that table instead of the slow view.
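One way to fill such a helper table regularly from inside MySQL is the event scheduler, available since MySQL 5.1.6. The sketch below assumes a hypothetical helper table team_stats_cache with team_id as its primary key; the event name is made up as well. A cron job running the same REPLACE statement would work equally well.

```sql
-- Hypothetical helper table holding the precomputed results.
CREATE TABLE team_stats_cache (
  team_id       INT PRIMARY KEY,
  overall_size  INT,
  quality_level DECIMAL(10,4)
);

-- The scheduler must be enabled for events to run (requires the SUPER privilege).
SET GLOBAL event_scheduler = ON;

-- Refresh the helper table once per day.
-- Note: REPLACE overwrites existing rows but does not delete rows for teams
-- that no longer have any sectors.
CREATE EVENT refresh_team_stats_cache
ON SCHEDULE EVERY 1 DAY
DO
  REPLACE INTO team_stats_cache (team_id, overall_size, quality_level)
  SELECT team_id, SUM(size), AVG(level)
  FROM team_sector
  GROUP BY team_id;

-- Searches then read the small precomputed table.
SELECT team_id, overall_size FROM team_stats_cache ORDER BY overall_size DESC;
```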

IMHO, anticipating possible performance bottlenecks and "closing" them before they actually show up is a waste of your time.
