Rails :include vs. :joins

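This is more of a "why do things work this way" question than an "I don't know how to do this" question...

So the gospel on pulling associated records is that you use :include, because you'll get a join and avoid a whole bunch of extra queries: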

Post.all(:include => :comments)

However, when you look at the logs, there's no join happening:

Post Load (3.7ms)   SELECT * FROM "posts"
Comment Load (0.2ms)   SELECT "comments.*" FROM "comments"
WHERE ("comments".post_id IN (1,2,3,4))
ORDER BY created_at asc)

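It is taking a shortcut, because it pulls all of the comments at once, but it's still not a join (which is what all the documentation seems to say). The only way I can get a join is to use :joins instead of :include:

Post.all(:joins => :comments)

And the logs show:

Post Load (6.0ms)   SELECT "posts".* FROM "posts" INNER JOIN "comments" ON "posts".id = "comments".post_id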

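Am I missing something? I have an app with half a dozen associations, and on one screen I display data from all of them. It seems like it would be better to have one joined query instead of six individual ones. I know that performance-wise it's not always better to do a join rather than individual queries (in fact, going by time spent, the two individual queries above look faster than the join), but after all the docs I've been reading I'm surprised to see :include not working as advertised.

Maybe Rails is aware of the performance issue and doesn't join except in certain cases?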


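It appears that the :include functionality was changed with Rails 2.1. Rails used to do the join in all cases, but for performance reasons it was changed to use multiple queries in some circumstances. This blog post by Fabio Akita has some good information on the change (see the section entitled "Optimized Eager Loading").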
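To make the multi-query strategy concrete, here is a minimal sketch in plain Ruby, with no ActiveRecord involved. It only models the idea visible in the question's log: one query for the posts, one query for all of their comments (the `IN (...)` lookup), and grouping in memory. The names `POSTS`, `COMMENTS`, `QueryLog` and `preload_comments` are hypothetical and exist only for this illustration; this is not how Rails is implemented internally.

```ruby
# Plain-Ruby model of the two-query eager-loading strategy.
# POSTS and COMMENTS stand in for database tables (hypothetical data).
POSTS = [
  { id: 1, title: "First" },
  { id: 2, title: "Second" },
  { id: 3, title: "Third" }
].freeze

COMMENTS = [
  { id: 10, post_id: 1, body: "a" },
  { id: 11, post_id: 1, body: "b" },
  { id: 12, post_id: 3, body: "c" }
].freeze

# Counts simulated round trips to the "database".
class QueryLog
  attr_reader :count

  def initialize
    @count = 0
  end

  # Counts one query and returns whatever the block returns.
  def record
    @count += 1
    yield
  end
end

# Query 1 loads every post; query 2 loads all of their comments at once,
# mirroring "WHERE post_id IN (...)". The grouping happens in memory.
def preload_comments(log)
  posts = log.record { POSTS.map(&:dup) }
  ids = posts.map { |post| post[:id] }
  comments = log.record { COMMENTS.select { |c| ids.include?(c[:post_id]) } }
  by_post = comments.group_by { |c| c[:post_id] }
  posts.each { |post| post[:comments] = by_post.fetch(post[:id], []) }
end

log = QueryLog.new
posts = preload_comments(log)
puts "queries: #{log.count}"
posts.each { |post| puts "#{post[:title]}: #{post[:comments].size} comment(s)" }
```

However many posts there are, the sketch issues two queries, which is the trade-off the answer describes: more than one query, but no wide joined result set.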


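.joins will just join the tables and bring back the selected fields. If you call associations on a joins query result, it will fire database queries again.

:includes will eager load the included associations and add them in memory. :includes loads all the attributes of the included tables. If you call associations on an includes query result, it will not fire any queries.

The difference between joins and include is that using the include statement generates a much larger SQL query, loading into memory all the attributes from the other table(s).

For example, if you have a table full of comments and you use :joins => users to pull in all the user information for sorting purposes and so on, it will work fine and take less time than :include. But say you want to display the comment along with the user's name, email, etc. To get that information using :joins, it will have to make a separate SQL query for each user it fetches, whereas if you used :include this information is ready for use.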

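Great example:

http://railscasts.com/episodes/181-include-vs-joins

In addition to performance considerations, there's a functional difference too. When you join comments, you are asking for posts that have comments (an inner join by default). When you include comments, you are asking for all posts (an outer join).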
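The functional difference can be sketched with plain Ruby collections. This is only a model of inner versus left outer join semantics, not ActiveRecord code; `PostRow`, `CommentRow`, `inner_join` and `left_outer_join` are hypothetical names introduced for the illustration.

```ruby
# Plain-Ruby model of the two join flavours; no ActiveRecord involved.
# PostRow and CommentRow are hypothetical stand-ins for table rows.
PostRow = Struct.new(:id, :title)
CommentRow = Struct.new(:id, :post_id)

post_rows = [PostRow.new(1, "Has comments"), PostRow.new(2, "No comments")]
comment_rows = [CommentRow.new(10, 1), CommentRow.new(11, 1)]

# INNER JOIN: a post appears once per matching comment and not at all
# when nothing matches.
def inner_join(posts, comments)
  posts.flat_map do |post|
    comments.select { |c| c.post_id == post.id }.map { |c| [post, c] }
  end
end

# LEFT OUTER JOIN: same pairing, but an unmatched post is kept and
# paired with nil.
def left_outer_join(posts, comments)
  posts.flat_map do |post|
    matches = comments.select { |c| c.post_id == post.id }
    matches.empty? ? [[post, nil]] : matches.map { |c| [post, c] }
  end
end

inner_titles = inner_join(post_rows, comment_rows).map { |post, _| post.title }.uniq
outer_titles = left_outer_join(post_rows, comment_rows).map { |post, _| post.title }.uniq

puts "inner: #{inner_titles.inspect}"
puts "outer: #{outer_titles.inspect}"
```

The post without comments survives only the outer variant, which is why the two calls can return different sets of posts.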

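I was recently reading more on the difference between :joins and :includes in Rails. Here is an explanation of what I understood (with examples :))

Consider this scenario:

A User has_many comments and a comment belongs_to a User.
The User model has the following attributes: name (string), age (integer). The Comment model has the following attributes: content, user_id. For a comment, user_id can be null.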

Joins:

:joins performs an inner join between two tables. Thus

Comment.joins(:user)

#=> <ActiveRecord::Relation [#<Comment id: 1, content: "Hi I am Aaditi.This is my first comment!", user_id: 1, created_at: "2014-11-12 18:29:24", updated_at: "2014-11-12 18:29:24">,
#<Comment id: 2, content: "Hi I am Ankita.This is my first comment!", user_id: 2, created_at: "2014-11-12 18:29:29", updated_at: "2014-11-12 18:29:29">,
#<Comment id: 3, content: "Hi I am John.This is my first comment!", user_id: 3, created_at: "2014-11-12 18:30:25", updated_at: "2014-11-12 18:30:25">]>

will fetch all records where user_id (of the comments table) is equal to id (of the users table). Thus if you do

Comment.joins(:user).where("comments.user_id is null")

#=> <ActiveRecord::Relation []>

you will get an empty array, as shown.

Moreover, joins does not load the joined table in memory. Thus if you do

comment_1 = Comment.joins(:user).first

comment_1.user.age
# User Load (0.0ms)  SELECT "users".* FROM "users" WHERE "users"."id" = ? ORDER BY "users"."id" ASC LIMIT 1  [["id", 1]]
#=> 24

As you can see, comment_1.user.age fires a database query again in the background to get the result.

Includes:

:includes behaves like a left outer join between the two tables: every comment is returned, whether or not it has a user. Thus

Comment.includes(:user)

#=> <ActiveRecord::Relation [#<Comment id: 1, content: "Hi I am Aaditi.This is my first comment!", user_id: 1, created_at: "2014-11-12 18:29:24", updated_at: "2014-11-12 18:29:24">,
#<Comment id: 2, content: "Hi I am Ankita.This is my first comment!", user_id: 2, created_at: "2014-11-12 18:29:29", updated_at: "2014-11-12 18:29:29">,
#<Comment id: 3, content: "Hi I am John.This is my first comment!", user_id: 3, created_at: "2014-11-12 18:30:25", updated_at: "2014-11-12 18:30:25">,
#<Comment id: 4, content: "Hi This is an anonymous comment!", user_id: nil, created_at: "2014-11-12 18:31:02", updated_at: "2014-11-12 18:31:02">]>

will result in a joined table with all the records from the comments table. Thus if you do

Comment.includes(:user).where("comments.user_id is null")

#=> #<ActiveRecord::Relation [#<Comment id: 4, content: "Hi This is an anonymous comment!", user_id: nil, created_at: "2014-11-12 18:31:02", updated_at: "2014-11-12 18:31:02">]>

it will fetch the records where comments.user_id is nil, as shown.

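Moreover, includes loads both tables in memory. Thus if you do

comment_1 = Comment.includes(:user).first

comment_1.user.age
#=> 24

As you can notice, comment_1.user.age simply returns the result from memory without firing a database query in the background.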


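I contrast them in two ways:

joins - For conditional selection of records.

includes - When using an association on each member of a result set.

Longer version

Joins is meant to filter the result set coming from the database. You use it to do set operations on your table. Think of it as a where clause that performs set theory.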

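Post.joins(:comments)

is the same as

Post.where('id in (select post_id from comments)')

Except that if there is more than one comment, you will get duplicate posts back with the joins. But every post will be a post that has comments. You can correct this with distinct: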

Post.joins(:comments).count
=> 10
Post.joins(:comments).distinct.count
=> 2

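In contrast, the includes method simply makes sure that there are no additional database queries when referencing the relation (so that we don't make n + 1 queries).

Post.includes(:comments).count
=> 4 # includes posts without comments so the count might be higher.

The moral is: use joins when you want to do conditional set operations, and use includes when you are going to be using a relation on each member of a collection.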
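The counts in this answer (10, 2 and 4) can be reproduced with a small plain-Ruby model. The data is an assumption chosen to match those numbers: four posts, of which two have five comments each. Nothing here calls ActiveRecord; it only illustrates why a join repeats parent rows and what de-duplicating does.

```ruby
# Plain-Ruby model of why a join can repeat parent rows.
# Four posts; only posts 1 and 2 have comments (five each). Hypothetical data.
post_ids = [1, 2, 3, 4]
comment_post_ids = [1] * 5 + [2] * 5 # the post_id column of ten comments

# Joining yields one row per (post, comment) pair, so a post with five
# comments shows up five times and a post with none does not show up.
joined_post_ids = post_ids.flat_map do |post_id|
  comment_post_ids.select { |fk| fk == post_id }
end

puts joined_post_ids.size      # rows produced by the join
puts joined_post_ids.uniq.size # after de-duplicating, like DISTINCT
puts post_ids.size             # every post, commented or not
```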


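.joins works as a database join: it joins two or more tables and fetches the selected data from the backend (database).

.includes works like a database left join. It loads all the records of the left-hand side, whether or not they have a match in the right-hand model. It is used for eager loading because it loads all associated objects in memory. If we call associations on an includes query result, it does not fire a query on the database; it simply returns the data from memory, because the data has already been loaded.

"joins" is only used for joining tables; when you call associations on a joins result, it will fire queries again (which means many queries will be fired).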

Let's suppose you have two models, User and Organisation.
User has_many organisations.
Suppose a user has 10 organisations.

@records = User.joins(:organisations).where("organisations.user_id = 1")

The query will be

select users.* from users INNER JOIN organisations ON organisations.user_id = users.id where organisations.user_id = 1

It returns one row per organisation related to the user, and

@records.map { |u| u.organisations.map(&:name) }

runs a query like

select * from organisations where organisations.user_id = x

once for each returned row (as many times as there are organisations).

In this case the total number of SQL queries is 11.

But "includes" will eager load the included associations and add them in memory (it loads all the associations on the first load) and will not fire queries again.

When you get records with includes, like

@records = User.includes(:organisations).where(id: 1)

then the queries will be

select * from users where users.id = 1

and

select * from organisations where organisations.user_id IN (1)

and when you run this

@records.map { |u| u.organisations.map(&:name) }

no query will be fired.
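The query counts in this answer can be checked with a small plain-Ruby model. `FakeDb` and its methods are hypothetical names invented for this sketch; it does not use ActiveRecord and only counts simulated round trips, assuming one user with ten organisations as in the example.

```ruby
# Plain-Ruby model of the query counts; FakeDb is a hypothetical stand-in
# for the database and only counts round trips.
class FakeDb
  attr_reader :queries

  ORGANISATIONS = (1..10).map { |i| { id: i, user_id: 1, name: "Org #{i}" } }.freeze

  def initialize
    @queries = 0
  end

  # Models "users INNER JOIN organisations": the user row repeats once per
  # organisation, and no organisation data is attached to it.
  def users_joined_with_organisations
    @queries += 1
    ORGANISATIONS.map { |org| { id: org[:user_id] } }
  end

  # Models a lazy association load: one query per call.
  def organisations_for(user_id)
    @queries += 1
    ORGANISATIONS.select { |org| org[:user_id] == user_id }
  end

  # Models eager loading: one query for the users, one for all of their
  # organisations, which are then attached in memory.
  def users_with_organisations_preloaded
    @queries += 1
    users = [{ id: 1 }]
    @queries += 1
    orgs = ORGANISATIONS.group_by { |org| org[:user_id] }
    users.each { |user| user[:organisations] = orgs.fetch(user[:id], []) }
  end
end

joins_db = FakeDb.new
joins_db.users_joined_with_organisations.each do |user|
  joins_db.organisations_for(user[:id]).map { |org| org[:name] }
end

includes_db = FakeDb.new
includes_db.users_with_organisations_preloaded.each do |user|
  user[:organisations].map { |org| org[:name] }
end

puts "joins-style access:    #{joins_db.queries} queries"
puts "includes-style access: #{includes_db.queries} queries"
```

In this model the joins-style loop costs one query plus one per returned row, while the includes-style loop stays at two queries however many organisations the user has.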