
Question about Ruby on Rails, Constants, belongs_to & Database Optimization/Performance

I've developed a web-based point-of-sale system for one of my clients in Ruby on Rails with a MySQL backend. They are growing so fast that they now ring up close to 10,000 transactions per day company-wide. For this question, I will use the transactions table as an example. Currently, I store transactions.status as a string (e.g. 'pending', 'completed', 'incomplete') in an indexed varchar(255) column. In the beginning this was fine for looking up records by status, because there were not many records to worry about. Over time, using the query analyzer, I have noticed that performance has worsened and that varchar columns can really slow down queries over thousands of lookups.

I've been thinking about converting these varchar columns to integer-based status columns, using a STATUS constant within the Transaction model like so:

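class Transaction < ActiveRecord::Base
  # Map symbolic status names to the integers stored in transactions.status.
  STATUS = { :incomplete => 0, :pending => 1, :completed => 2 }

  # Class-level finder, called as Transaction.expensive_query_by_status(:pending).
  def self.expensive_query_by_status(status)
    find(:all,
         :select => "id, cashier, total, status",
         :conditions => { :status => STATUS[status.to_sym] })
  end
end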

Is this the best route for me to take? What do you suggest? I am already using proper indexes on the various lookup fields, and memcached for query caching wherever possible. The system currently runs in a distributed environment of three servers: the first for the application, the second for the database and the third for caching (all in one datacenter and on the same VLAN).

Have you tried the alternative on a representative database? From the example given, I'm a little sceptical that it's going to make much difference. If there are only three statuses, then a query by status may be better off not using an index at all.

Say "completed" comprises 80% of your table. With no other indexed column involved, you're going to need more reads if the index is used than if it is not. So a query of that type is almost certainly going to get slower as the table grows. Queries for "incomplete" and "pending" would probably still benefit from an index, however; they'd only be affected as the total number of rows with those statuses grew.
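To make the 80% point concrete, here is a rough sketch that works out what fraction of the table each status lookup would return. The row counts and the 0.3 cut-off are made-up assumptions for illustration; the real break-even point depends on the optimizer, the storage engine and the row size, so treat the output as a hint rather than a prediction.

```ruby
# Fraction of the table that a lookup on each status would return.
# `counts` maps a status name to its row count, e.g. the result of
# SELECT status, COUNT(*) FROM transactions GROUP BY status.
def selectivity(counts)
  total = counts.values.inject(0) { |sum, n| sum + n }
  counts.each_with_object({}) do |(status, n), result|
    result[status] = n.to_f / total
  end
end

# Assumed rule of thumb: above `threshold`, a full scan is likely cheaper
# than going through the index. The default of 0.3 is an assumption.
def scan_likely_cheaper?(fraction, threshold = 0.3)
  fraction > threshold
end

# Hypothetical row counts matching the 80% example above.
example_counts = { :completed => 800_000, :pending => 120_000, :incomplete => 80_000 }

selectivity(example_counts).each do |status, fraction|
  plan = scan_likely_cheaper?(fraction) ? "full scan likely cheaper" : "index likely helps"
  puts format("%-10s %5.1f%%  %s", status, fraction * 100, plan)
end
```

Running EXPLAIN on the real queries is still the only way to see what MySQL actually chooses.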

How often do you look at everything, completed and otherwise, without some more selective criterion? Could you partition the table in some (internal or external) way? For example, store completed transactions in a separate table, moving new ones there as they reach their final (?) state. I think internal database partitioning was introduced in MySQL 5.1; looking at the documentation, it seems that a RANGE partition might be appropriate.
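As a rough idea of what the internal option could look like, the sketch below only builds the SQL text for a RANGE partition; it does not run it. It assumes the status column has already been converted to an integer with the completed code being the highest value, and the partition names (p_open, p_completed) are invented for the example. Note that MySQL requires the partitioning column to be part of every unique key on the table, including the primary key, so the existing keys would probably need changing first.

```ruby
# Builds (but does not execute) a MySQL 5.1+ statement that splits a table
# into "still open" and "completed" partitions by an integer status column.
# Table, column and partition names are illustrative assumptions.
def range_partition_sql(table, column, completed_code)
  <<-SQL
ALTER TABLE #{table}
PARTITION BY RANGE (#{column}) (
  PARTITION p_open      VALUES LESS THAN (#{completed_code}),
  PARTITION p_completed VALUES LESS THAN MAXVALUE
)
  SQL
end

# With :incomplete => 0, :pending => 1, :completed => 2, everything below 2
# lands in p_open and completed rows land in p_completed.
puts range_partition_sql("transactions", "status", 2)
```

Try it on a copy of the data before touching production; repartitioning a large table rewrites it.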

All that said, I do think there's probably some benefit to moving away from storing statuses as strings. Storage and bandwidth considerations aside, it's a lot less likely that you'll inadvertently misspell an integer or, better yet, a constant or symbol.
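One way to get that protection against misspellings is to look the status up with Hash#fetch rather than [], so an unknown name raises instead of quietly becoming nil (which, in a hash condition, would turn into a status IS NULL lookup). This is a minimal stand-alone sketch; the TransactionStatus module and its method names are hypothetical, and in the real app the mapping would more likely sit inside the Transaction model.

```ruby
# Hypothetical stand-alone module holding the status mapping.
module TransactionStatus
  # Symbolic name -> integer stored in the database.
  CODES = { :incomplete => 0, :pending => 1, :completed => 2 }.freeze
  # Integer read from the database -> symbolic name.
  NAMES = CODES.invert.freeze

  # Accepts a symbol or string. Hash#fetch raises KeyError on an unknown
  # name instead of returning nil.
  def self.code_for(name)
    CODES.fetch(name.to_sym)
  end

  # Raises KeyError if the integer is not a known status code.
  def self.name_for(code)
    NAMES.fetch(code)
  end
end

puts TransactionStatus.code_for("pending")
puts TransactionStatus.name_for(2)
```

A typo such as TransactionStatus.code_for(:complted) then fails loudly at the call site rather than returning an empty result set.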

You might want to start limiting your searches (if you're not doing that already); #find(:all) is pretty taxing at that scale. Also, you might want to think about which associations your Transaction model reaches for when it is rendered in your views, and perhaps eager load those to minimize extra requests to the db.
