Tricky Rails 3/MySQL query

Question

In Rails 3 (feel free to use the meta_where gem in your query), I have a really tricky query that I have been banging my head against:

Suppose I have two models, Customer and Purchase, where a customer has many purchases. Let's define a customer with at least 2 purchases as a "repeat_customer". I need to find the total number of repeat customers as of each day for the past 3 months, something like:

Date    TotalRepeatCustomerCount
1/1/11  10 (10 repeat customers by the end of 1/1/11)
1/2/11  15 (5 more customer gained "repeat" status on this date)
1/3/11  16 (1 more customer gained "repeat" status on this date)
...
3/30/11 150
3/31/11 160

Basically I need to group the customer count by the creation date of each customer's second purchase, since that is when they gain "repeat" status.

Certainly this can be achieved in Ruby, something like:

Customer.includes(:purchases).all.select{|x| x.purchases.count >= 2 }.group_by{|x| x.purchases.second.created_at.to_date }.map{|date, customers| [date, customers.count]}

However, the above code fires queries along the lines of Customer.all and Purchase.all, then does a bunch of calculation in Ruby. I would much prefer to do the selection, grouping and calculation in MySQL, since that is not only much faster but also reduces the bandwidth from the database. On a large database, the code above is basically useless.

I have been trying for a while to conjure up the query in Rails/ActiveRecord, but have had no luck, even with the nice meta_where gem. If I have to, I will accept a solution in pure MySQL as well.

Edited: I would cache it (or add a "repeat" field to customers), but only if this simplified problem were the whole story. The client can change the criteria for a repeat customer at any point (2 purchases, 3 purchases, 4 purchases, etc.), so unfortunately I do have to calculate it on the spot.
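To make the target result concrete, here is a minimal plain-Ruby sketch of the calculation with the threshold as a parameter. It is not the SQL solution being asked for. The method name `repeat_customer_totals` and the `[customer_id, date]` input shape are assumptions made for illustration only.

```ruby
require 'date'

# Hypothetical helper: `purchases` is an array of [customer_id, Date] pairs.
# Returns { date => running total of customers who had made `threshold`
# purchases by the end of that date }.
def repeat_customer_totals(purchases, threshold = 2)
  # Date on which each customer made their threshold-th purchase
  # (nil, and therefore dropped, if they never got that far).
  qualifying_dates = purchases
    .group_by { |customer_id, _| customer_id }
    .map { |_, rows| rows.map { |_, date| date }.sort[threshold - 1] }
    .compact

  # Number of customers who gained "repeat" status on each date.
  gained = Hash.new(0)
  qualifying_dates.each { |date| gained[date] += 1 }

  # Accumulate the per-day gains into a running total.
  total = 0
  gained.keys.sort.each_with_object({}) do |date, totals|
    total += gained[date]
    totals[date] = total
  end
end
```

Only dates on which at least one customer gained repeat status appear in the result; a day with no change would need to be filled in from the previous day's total.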

Answer 1

SELECT p_date, COUNT(a.id) FROM
(
  SELECT p_date - INTERVAL 1 day p_date, customers.id
  FROM
     -- explicit join; assumes the Rails-convention purchases.customer_id key
     customers JOIN purchases ON purchases.customer_id = customers.id
     JOIN (SELECT DISTINCT date(purchase_date) p_date FROM purchases) p_dates
  WHERE purchases.purchase_date < p_date
  GROUP BY p_date, customers.id
  HAVING COUNT(purchases.id) >= 2
) a
GROUP BY p_date

I didn't test this in the slightest, so I hope it works. Also, I hope I understood what you are trying to accomplish.

But please note that you should not do this; it will be too slow. Since the data never changes once a day has passed, just cache the result for each day.
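The per-day caching idea could look roughly like the sketch below. `DailyCountCache` is a hypothetical in-memory class written for illustration; a real application would more likely store the counts in a table or in Rails.cache, and the block passed in stands in for whatever query computes one day's count.

```ruby
require 'date'

# Hypothetical in-memory cache of per-day counts.
class DailyCountCache
  # The block runs the expensive calculation for a single date.
  def initialize(&compute)
    @compute = compute
    @store = {}
  end

  # Completed days never change, so they are computed once and reused.
  # Today (or any later date) can still change, so it is always recomputed.
  def fetch(date, today = Date.today)
    return @compute.call(date) if date >= today
    @store.fetch(date) { @store[date] = @compute.call(date) }
  end
end
```

Because the repeat-customer threshold can change, the cache would also have to be keyed by the threshold, or cleared whenever it changes.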
