Unique records in Ruby on Rails

Question

What is the best way to return unique records from the database? Please consider the following:

@users = User.joins('LEFT JOIN subscriptions s ON users.id = s.user_id').includes(:profile).with_deleted.where("...", params[:conditions]).order("users.#{sort_column}" + ' ' + sort_direction).page params[:page]

It has a fair number of joins and conditions, plus paging, so for now the users are not unique. This is one way to make them unique:

@users = @users.select('DISTINCT(users.id), users.created_at, users.deleted_at , ...')

However, this seems to be very slow, and I see a lot of EXPLAIN output in the log, which tells me it's not a good query.

I also tried using the uniq method, like this:

@users = @users.uniq{|u| [u.email]}

This seems to run even longer than the statement above (it times out the web worker). What is the correct way to de-duplicate the records? Or what would be the optimal thing to do in this kind of situation?

There are about 120k users, but only 25 should be displayed at a time, hence the .page method in the first/second statements.

Answer 1

uniq is a method of Array, so it loads the whole set of 120k users and iterates through them one by one in Ruby to check the condition. This is definitely the wrong way to do filtering.
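To make the cost concrete, here is a minimal plain-Ruby sketch with no Rails involved. The `User` struct, the `dedupe_by_email` helper, and the sample rows are hypothetical stand-ins for records that have already been loaded. The only point is that Array#uniq with a block has to call the block for every element held in memory.

```ruby
# Hypothetical stand-in for already-loaded user records (not ActiveRecord).
User = Struct.new(:id, :email)

# Returns the de-duplicated list plus the number of elements the block visited.
def dedupe_by_email(users)
  visited = 0
  unique = users.uniq do |u|
    visited += 1 # the block runs once per element in the array
    u.email      # elements with the same key count as duplicates
  end
  [unique, visited]
end

# Pretend the join duplicated user 1.
rows = [
  User.new(1, "a@example.com"),
  User.new(1, "a@example.com"),
  User.new(2, "b@example.com")
]

unique, visited = dedupe_by_email(rows)
puts unique.map(&:id).inspect # [1, 2]
puts visited                  # 3
```

With three rows this is trivial. With every row for 120k users loaded first, the same per-element work happens in the web worker instead of in the database.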

On the other hand, DISTINCT(users.id) is an SQL clause, which is handled by your PostgreSQL server. It should execute quite fast. If it takes significant time, double-check your indexes (users.id, subscriptions.user_id, profiles.user_id, and basically all of the primary and foreign keys, as well as any attributes that can be queried in your where clause).

ActiveRecord has a distinct method to specify a uniqueness constraint, but its implementation simply uses Arel to issue the same SQL DISTINCT query, so there should be no performance difference.

PS: as a side note, there is no need to enumerate all the desired fields of users in your select query. The following should select all the fields of the users table for you:

@users = @users.select('DISTINCT(users.id), users.*')
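As a rough illustration of why this works, the plain-Ruby sketch below imitates the SQL semantics on made-up data; it does not run any SQL. The sample tables and both helper methods are hypothetical. A LEFT JOIN emits one row per matching subscription, so a user with two subscriptions appears twice. SELECT DISTINCT compares the whole selected row, and since only users columns are selected, those duplicate rows are identical and collapse to one.

```ruby
# Hypothetical sample data standing in for the users and subscriptions tables.
USERS = [
  { id: 1, email: "a@example.com" },
  { id: 2, email: "b@example.com" },
  { id: 3, email: "c@example.com" }
].freeze

SUBSCRIPTIONS = [
  { id: 10, user_id: 1 },
  { id: 11, user_id: 1 }, # user 1 has two subscriptions
  { id: 12, user_id: 2 }
].freeze

# Imitates: SELECT users.* FROM users LEFT JOIN subscriptions s ON users.id = s.user_id
# One output row per matching subscription; users with none still appear once.
def left_join_user_rows(users, subscriptions)
  users.flat_map do |user|
    matches = subscriptions.select { |s| s[:user_id] == user[:id] }
    matches.empty? ? [user] : matches.map { user }
  end
end

# Imitates SELECT DISTINCT: rows equal in every selected column collapse to one.
# Array#uniq is used here only to model what the database does server-side.
def distinct_rows(rows)
  rows.uniq
end

joined = left_join_user_rows(USERS, SUBSCRIPTIONS)
puts joined.map { |r| r[:id] }.inspect                # [1, 1, 2, 3]
puts distinct_rows(joined).map { |r| r[:id] }.inspect # [1, 2, 3]
```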

Answer 2

Check the documentation for distinct.

Also note that in your third example you are loading all the elements into memory and then operating on them, which is slow and memory-hungry.

You should instead instruct the DBMS to return unique records by using distinct.

Answer 1
1, accepted, 2014-09-29 18:27:47

Answer 2
0, 2014-09-29 18:21:54
