Rails :include vs. :joins

This is more of a "why do things work this way" question rather than a "I don't know how to do this" question...

So the gospel on pulling associated records that you know you're going to use is to use :include because you'll get a join and avoid a whole bunch of extra queries:

Post.all(:include => :comments)

However, when you look at the logs, there's no join happening:

Post Load (3.7ms)   SELECT * FROM "posts"
Comment Load (0.2ms)   SELECT "comments".* FROM "comments" 
                       WHERE ("comments".post_id IN (1,2,3,4)) 
                       ORDER BY created_at asc

It is taking a shortcut because it pulls all of the comments at once, but it's still not a join (which is what all the documentation seems to say). The only way I can get a join is to use :joins instead of :include:

Post.all(:joins => :comments)

And the logs show:

Post Load (6.0ms)  SELECT "posts".* FROM "posts" 
                   INNER JOIN "comments" ON "posts".id = "comments".post_id

Am I missing something? I have an app with half a dozen associations, and on one screen I display data from all of them. It seems like it would be better to have one joined query instead of six individual ones. I know that performance-wise it's not always better to do a join rather than individual queries (in fact, going by time spent, the two individual queries above look faster than the join), but after all the docs I've been reading I'm surprised to see :include not working as advertised.

Maybe Rails is cognizant of the performance issue and doesn't join except in certain cases?

It appears that the :include functionality was changed with Rails 2.1. Rails used to do the join in all cases, but for performance reasons it was changed to use multiple queries in some circumstances. This blog post by Fabio Akita has some good information on the change (see the section entitled "Optimized Eager Loading").
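To make the multiple-query strategy concrete, here is a minimal plain-Ruby sketch (no Rails involved) of what the two log lines in the question amount to: fetch the posts, fetch all their comments with a single IN (...) lookup, then stitch the two together in memory. The arrays of hashes and the helper name preload_comments are made up for illustration; this is not how Active Record is implemented internally, only the general idea.

```ruby
# Plain-Ruby sketch: hashes stand in for table rows (sample data is made up).
POSTS = [
  { id: 1, title: "First" },
  { id: 2, title: "Second" },
  { id: 3, title: "Third" }
].freeze

COMMENTS = [
  { id: 10, post_id: 1, body: "Nice" },
  { id: 11, post_id: 1, body: "Agreed" },
  { id: 12, post_id: 3, body: "Thanks" }
].freeze

# Hypothetical helper imitating the two-query strategy seen in the log:
#   1. SELECT * FROM posts
#   2. SELECT comments.* FROM comments WHERE post_id IN (...)
# followed by attaching each comment to its post in memory.
def preload_comments(posts, comments)
  post_ids = posts.map { |post| post[:id] }                   # ids for the IN (...) list
  fetched  = comments.select { |c| post_ids.include?(c[:post_id]) } # the "second query"
  by_post  = fetched.group_by { |c| c[:post_id] }

  # Posts without comments still come back, with an empty list attached.
  posts.map { |post| post.merge(comments: by_post.fetch(post[:id], [])) }
end

LOADED_POSTS = preload_comments(POSTS, COMMENTS)

LOADED_POSTS.each do |post|
  bodies = post[:comments].map { |c| c[:body] }.join(", ")
  puts "#{post[:title]}: #{bodies}"
end
```

However many posts there are, the lookup count stays at two, which is why this approach can beat a single wide join that repeats every post column once per comment.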

.joins just joins the tables and returns the selected fields. If you call associations on the result of a joins query, it will fire database queries again.

:includes eager loads the included associations and keeps them in memory. It loads all the attributes of the included tables. If you call associations on the result of an includes query, it will not fire any queries.

The difference between joins and include is that using the include statement generates a much larger SQL query, loading into memory all the attributes from the other table(s).

For example, if you have a table full of comments and you use a :joins => users to pull in all the user information for sorting purposes, etc., it will work fine and take less time than :include. But say you want to display the comment along with the user's name, email, etc. To get that information using :joins, it will have to make a separate SQL query for each user it fetches, whereas if you used :include this information is ready for use.

Great example:

http://railscasts.com/episodes/181-include-vs-joins

In addition to performance considerations, there's a functional difference too. When you join comments, you are asking for posts that have comments: an inner join by default. When you include comments, you are asking for all posts: an outer join.
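The inner-versus-outer distinction can be sketched in plain Ruby without any database. The sample rows and the helper names inner_join and left_outer_join below are assumptions made up for the illustration; they only mimic which rows each kind of join keeps.

```ruby
# Made-up sample data: one post with two comments, one post with none.
BLOG_POSTS = [
  { id: 1, title: "Has comments" },
  { id: 2, title: "No comments" }
].freeze

BLOG_COMMENTS = [
  { id: 1, post_id: 1, body: "a" },
  { id: 2, post_id: 1, body: "b" }
].freeze

# INNER JOIN: one output row per (post, comment) match.
# Posts without any matching comment disappear from the result.
def inner_join(posts, comments)
  posts.flat_map do |post|
    comments.select { |c| c[:post_id] == post[:id] }
            .map { |c| { post: post, comment: c } }
  end
end

# LEFT OUTER JOIN: same matches, but a post with no comments
# still yields one row, with nil in place of the comment.
def left_outer_join(posts, comments)
  posts.flat_map do |post|
    matches = comments.select { |c| c[:post_id] == post[:id] }
    matches = [nil] if matches.empty?
    matches.map { |c| { post: post, comment: c } }
  end
end

puts inner_join(BLOG_POSTS, BLOG_COMMENTS).map { |row| row[:post][:title] }.inspect
# → ["Has comments", "Has comments"]
puts left_outer_join(BLOG_POSTS, BLOG_COMMENTS).map { |row| row[:post][:title] }.inspect
# → ["Has comments", "Has comments", "No comments"]
```

Note that the inner join also repeats the post once per comment, which is where duplicate records in a joins result come from.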

I was recently reading more about the difference between :joins and :includes in Rails. Here is an explanation of what I understood (with examples :))

Consider this scenario:

  • A User has_many comments and a Comment belongs_to a User.

  • The User model has the following attributes: Name (string), Age (integer). The Comment model has the following attributes: Content, user_id. For a comment, user_id can be null.

Joins:

:joins performs an inner join between two tables. Thus

Comment.joins(:user)

#=> <ActiveRecord::Relation [#<Comment id: 1, content: "Hi I am Aaditi.This is my first   comment!", user_id: 1, created_at: "2014-11-12 18:29:24", updated_at: "2014-11-12 18:29:24">, 
     #<Comment id: 2, content: "Hi I am Ankita.This is my first comment!", user_id: 2, created_at: "2014-11-12 18:29:29", updated_at: "2014-11-12 18:29:29">,    
     #<Comment id: 3, content: "Hi I am John.This is my first comment!", user_id: 3, created_at: "2014-11-12 18:30:25", updated_at: "2014-11-12 18:30:25">]>

will fetch all records where user_id (of the comments table) is equal to users.id (of the users table). Thus if you do

Comment.joins(:user).where("comments.user_id is null")

#=> <ActiveRecord::Relation []>

You will get an empty relation, as shown.

Moreover, joins does not load the joined table in memory. Thus if you do

comment_1 = Comment.joins(:user).first

comment_1.user.age
#=> User Load (0.0ms)  SELECT "users".* FROM "users" WHERE "users"."id" = ? ORDER BY "users"."id" ASC LIMIT 1  [["id", 1]]
#=> 24

As you can see, comment_1.user.age fires another database query in the background to get the result.

Includes:

:includes behaves like a left outer join between the two tables: every comment is returned, whether or not it has a user. Thus

Comment.includes(:user)

#=><ActiveRecord::Relation [#<Comment id: 1, content: "Hi I am Aaditi.This is my first comment!", user_id: 1, created_at: "2014-11-12 18:29:24", updated_at: "2014-11-12 18:29:24">,
   #<Comment id: 2, content: "Hi I am Ankita.This is my first comment!", user_id: 2, created_at: "2014-11-12 18:29:29", updated_at: "2014-11-12 18:29:29">,
   #<Comment id: 3, content: "Hi I am John.This is my first comment!", user_id: 3, created_at: "2014-11-12 18:30:25", updated_at: "2014-11-12 18:30:25">,    
   #<Comment id: 4, content: "Hi This is an anonymous comment!", user_id: nil, created_at: "2014-11-12 18:31:02", updated_at: "2014-11-12 18:31:02">]>

will result in a joined table with all the records from the comments table. Thus if you do

Comment.includes(:user).where("comments.user_id is null")
#=> #<ActiveRecord::Relation [#<Comment id: 4, content: "Hi This is an anonymous comment!", user_id: nil, created_at: "2014-11-12 18:31:02", updated_at: "2014-11-12 18:31:02">]>

it will fetch the records where comments.user_id is nil, as shown.

Moreover, includes loads both tables into memory. Thus if you do

comment_1 = Comment.includes(:user).first

comment_1.user.age
#=> 24

As you can see, comment_1.user.age simply reads the result from memory without firing a database query in the background.
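The difference in query counts can be imitated with a tiny fake database that counts lookups. Everything here is hypothetical (the FakeDb class, the helper names, and the ages of the second and third users, which the example above does not give); it is only a rough sketch of lazy per-record loading versus one batched eager load, not Active Record itself.

```ruby
# Hypothetical in-memory "database" that counts the lookups it receives.
class FakeDb
  attr_reader :query_count

  def initialize(users)
    @users = users
    @query_count = 0
  end

  # Stands in for: SELECT * FROM users WHERE id = ? LIMIT 1
  def find_user(id)
    @query_count += 1
    @users.find { |u| u[:id] == id }
  end

  # Stands in for: SELECT * FROM users WHERE id IN (...)
  def find_users(ids)
    @query_count += 1
    @users.select { |u| ids.include?(u[:id]) }
  end
end

# Ages other than the first are made up for the sketch.
SAMPLE_USERS = [
  { id: 1, name: "Aaditi", age: 24 },
  { id: 2, name: "Ankita", age: 25 },
  { id: 3, name: "John",   age: 30 }
].freeze

SAMPLE_COMMENTS = [
  { id: 1, user_id: 1 },
  { id: 2, user_id: 2 },
  { id: 3, user_id: 3 },
  { id: 4, user_id: nil } # anonymous comment
].freeze

# Lazy style (what you get after joins): one user lookup per comment with a user.
def ages_lazily(db, comments)
  comments.map do |comment|
    user = comment[:user_id] && db.find_user(comment[:user_id])
    user && user[:age]
  end
end

# Eager style (what includes arranges): one batched lookup, then in-memory reads.
def ages_eagerly(db, comments)
  ids = comments.map { |c| c[:user_id] }.compact.uniq
  users_by_id = db.find_users(ids).map { |u| [u[:id], u] }.to_h
  comments.map { |c| users_by_id[c[:user_id]]&.fetch(:age) }
end

lazy_db  = FakeDb.new(SAMPLE_USERS)
eager_db = FakeDb.new(SAMPLE_USERS)
puts ages_lazily(lazy_db, SAMPLE_COMMENTS).inspect   # → [24, 25, 30, nil]
puts lazy_db.query_count                             # → 3
puts ages_eagerly(eager_db, SAMPLE_COMMENTS).inspect # → [24, 25, 30, nil]
puts eager_db.query_count                            # → 1
```

The counts cover user lookups only; the query that fetches the comments themselves is the same in both styles.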

tl;dr

I contrast them in two ways:

joins - For conditional selection of records.

includes - When using an association on each member of a result set.

Longer version

Joins is meant to filter the result set coming from the database. You use it to do set operations on your table. Think of it as a where clause that performs set theory.

Post.joins(:comments)

is the same as

Post.where('id in (select post_id from comments)')

Except that if a post has more than one comment, you will get duplicate posts back with the joins. But every post will be a post that has comments. You can correct this with distinct:

Post.joins(:comments).count
=> 10
Post.joins(:comments).distinct.count
=> 2

In contrast, the includes method simply makes sure that there are no additional database queries when referencing the relation (so that we don't make n + 1 queries):

Post.includes(:comments).count
=> 4 # includes posts without comments so the count might be higher.
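The three counts above (10, 2 and 4) can be reproduced with a plain-Ruby sketch. The distribution of comments is an assumption chosen to match those numbers (4 posts, of which post 1 has 6 comments and post 2 has 4), and the helper names are made up.

```ruby
# Assumed data: 4 posts; post 1 has 6 comments, post 2 has 4, posts 3 and 4 have none.
COUNT_POSTS = (1..4).map { |id| { id: id } }.freeze
COUNT_COMMENTS = (Array.new(6) { { post_id: 1 } } +
                  Array.new(4) { { post_id: 2 } }).freeze

# Mimics Post.joins(:comments): one post row per matching comment,
# so a post is repeated once for each of its comments.
def joined_posts(posts, comments)
  comments.flat_map do |comment|
    posts.select { |post| post[:id] == comment[:post_id] }
  end
end

# Mimics Post.where('id in (select post_id from comments)'):
# each post appears at most once.
def posts_with_comments(posts, comments)
  commented_ids = comments.map { |comment| comment[:post_id] }.uniq
  posts.select { |post| commented_ids.include?(post[:id]) }
end

puts joined_posts(COUNT_POSTS, COUNT_COMMENTS).size        # → 10 (joins, duplicates kept)
puts joined_posts(COUNT_POSTS, COUNT_COMMENTS).uniq.size   # → 2  (joins + distinct)
puts posts_with_comments(COUNT_POSTS, COUNT_COMMENTS).size # → 2  (subquery form)
puts COUNT_POSTS.size                                      # → 4  (includes keeps every post)
```

So the subquery form and the joins form select the same posts, and only differ in the duplicates.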

The moral is: use joins when you want to do conditional set operations, and use includes when you are going to be using a relation on each member of a collection.

.joins works as a database join: it joins two or more tables and fetches the selected data from the backend (database).

.includes works like a database left join. It loads all the records of the left side, whether or not they have a matching record in the right-hand side model. It is used for eager loading because it loads all the associated objects into memory. If we call associations on the result of an includes query, it does not fire a query on the database; it simply returns the data from memory, because it has already been loaded.

'joins' is just used to join tables, and when you call associations on the result of a joins, it will fire a query again (meaning many queries will fire).

Let's suppose you have two models, User and Organisation:
User has_many organisations
Suppose you have 10 organisations for a user:
@records = User.joins(:organisations).where("organisations.user_id = 1")
The QUERY will be
 select * from users INNER JOIN organisations ON organisations.user_id = users.id where organisations.user_id = 1

It will return one user record for each organisation related to the user,
and @records.map{|u| u.organisations.map(&:name)}
will run a QUERY like
select * from organisations where organisations.user_id = 1, once per returned record (as many times as you have organisations)

The total number of SQL queries is 11 in this case.

But 'includes' will eager load the included associations and keep them in memory (it loads all associations on the first load) and will not fire a query again.

When you get records with includes, like @records = User.includes(:organisations).where("organisations.user_id = 1"), then the queries will be

select * from users INNER JOIN organisations ON organisations.user_id = users.id where organisations.user_id = 1
and

 select * from organisations where organisations.id IN (IDs of the organisations, 1 to 10) if there are 10 organisations
and when you run

@records.map{|u| u.organisations.map(&:name)} no query will fire.
