Retrieving the last record in each group - MySQL

There is a table messages that contains data as shown below:

Id Name Other_Columns
-------------------------
1 A A_data_1
2 A A_data_2
3 A A_data_3
4 B B_data_1
5 B B_data_2
6 C C_data_1

If I run the query select * from messages group by name, I get the following result:

1 A A_data_1
4 B B_data_1
6 C C_data_1

What query will return the following result?

3 A A_data_3
5 B B_data_2
6 C C_data_1

That is, the last record in each group should be returned.

At present, this is the query that I use:

SELECT
*
FROM (SELECT
*
FROM messages
ORDER BY id DESC) AS x
GROUP BY name

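But this looks highly inefficient. Are there other ways to achieve the same result?

Related discussion

MySQL 8.0 now supports window functions, like almost all popular SQL implementations. With this standard syntax, we can write greatest-n-per-group queries: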

WITH ranked_messages AS (
SELECT m.*, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
FROM messages AS m
)
SELECT * FROM ranked_messages WHERE rn = 1;
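
To try the window-function query on the sample data from the question, here is a minimal sketch. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for MySQL 8.0, purely so the example is self-contained; the column name other_columns follows the question's table header.

```python
import sqlite3

# Assumption: SQLite (3.25+ for window functions) stands in for MySQL 8.0.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT, other_columns TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [(1, "A", "A_data_1"), (2, "A", "A_data_2"), (3, "A", "A_data_3"),
     (4, "B", "B_data_1"), (5, "B", "B_data_2"), (6, "C", "C_data_1")],
)

# Number the rows of each name from highest to lowest id, then keep row 1.
last_rows = conn.execute("""
    WITH ranked_messages AS (
        SELECT m.*, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
        FROM messages AS m
    )
    SELECT id, name, other_columns
    FROM ranked_messages
    WHERE rn = 1
    ORDER BY id
""").fetchall()
print(last_rows)
```

The outer ORDER BY id is only there to make the output order predictable.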

Below is the original answer I wrote for this question in 2009:

I write the solution this way:

SELECT m1.*
FROM messages m1 LEFT JOIN messages m2
ON (m1.name = m2.name AND m1.id < m2.id)
WHERE m2.id IS NULL;
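
A minimal sketch for checking the LEFT JOIN form on the question's sample data. It assumes SQLite through Python's sqlite3 module as a stand-in for MySQL. A row of m1 survives the WHERE clause only when no row m2 with the same name and a higher id exists.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT, other_columns TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [(1, "A", "A_data_1"), (2, "A", "A_data_2"), (3, "A", "A_data_3"),
     (4, "B", "B_data_1"), (5, "B", "B_data_2"), (6, "C", "C_data_1")],
)

# Anti-join: keep the m1 rows for which the join found no later m2 row.
latest = conn.execute("""
    SELECT m1.id, m1.name, m1.other_columns
    FROM messages m1 LEFT JOIN messages m2
      ON (m1.name = m2.name AND m1.id < m2.id)
    WHERE m2.id IS NULL
    ORDER BY m1.id
""").fetchall()
print(latest)
```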

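Regarding performance, one solution or the other can be better, depending on the nature of your data. So you should test both queries and use the one that performs better given your database.

For example, I have a copy of the StackOverflow August data dump. I'll use that for benchmarking. There are 1,114,357 rows in the Posts table. This is running on MySQL 5.0.75 on my MacBook Pro 2.40GHz.

I'll write a query to find the most recent post for a given user ID (mine).

First, using the technique shown by @Eric with the GROUP BY in a subquery: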

SELECT p1.postid
FROM Posts p1
INNER JOIN (SELECT pi.owneruserid, MAX(pi.postid) AS maxpostid
FROM Posts pi GROUP BY pi.owneruserid) p2
ON (p1.postid = p2.maxpostid)
WHERE p1.owneruserid = 20860;

1 row in set (1 min 17.89 sec)

Even the EXPLAIN analysis takes 16 seconds.

Now produce the same query result using my technique with LEFT JOIN:

SELECT p1.postid
FROM Posts p1 LEFT JOIN posts p2
ON (p1.owneruserid = p2.owneruserid AND p1.postid < p2.postid)
WHERE p2.postid IS NULL AND p1.owneruserid = 20860;

1 row in set (0.28 sec)

The EXPLAIN analysis shows that both tables are able to use their indexes:

+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
| id | select_type | table | type | possible_keys              | key         | key_len | ref   | rows | Extra                                |
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
|  1 | SIMPLE      | p1    | ref  | OwnerUserId                | OwnerUserId | 8       | const | 1384 | Using index                          |
|  1 | SIMPLE      | p2    | ref  | PRIMARY,PostId,OwnerUserId | OwnerUserId | 8       | const | 1384 | Using where; Using index; Not exists |
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
2 rows in set (0.00 sec)

Here's the DDL for my Posts table:

CREATE TABLE `posts` (
`PostId` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`PostTypeId` BIGINT(20) UNSIGNED NOT NULL,
`AcceptedAnswerId` BIGINT(20) UNSIGNED DEFAULT NULL,
`ParentId` BIGINT(20) UNSIGNED DEFAULT NULL,
`CreationDate` datetime NOT NULL,
`Score` INT(11) NOT NULL DEFAULT '0',
`ViewCount` INT(11) NOT NULL DEFAULT '0',
`Body` text NOT NULL,
`OwnerUserId` BIGINT(20) UNSIGNED NOT NULL,
`OwnerDisplayName` VARCHAR(40) DEFAULT NULL,
`LastEditorUserId` BIGINT(20) UNSIGNED DEFAULT NULL,
`LastEditDate` datetime DEFAULT NULL,
`LastActivityDate` datetime DEFAULT NULL,
`Title` VARCHAR(250) NOT NULL DEFAULT '',
`Tags` VARCHAR(150) NOT NULL DEFAULT '',
`AnswerCount` INT(11) NOT NULL DEFAULT '0',
`CommentCount` INT(11) NOT NULL DEFAULT '0',
`FavoriteCount` INT(11) NOT NULL DEFAULT '0',
`ClosedDate` datetime DEFAULT NULL,
PRIMARY KEY (`PostId`),
UNIQUE KEY `PostId` (`PostId`),
KEY `PostTypeId` (`PostTypeId`),
KEY `AcceptedAnswerId` (`AcceptedAnswerId`),
KEY `OwnerUserId` (`OwnerUserId`),
KEY `LastEditorUserId` (`LastEditorUserId`),
KEY `ParentId` (`ParentId`),
CONSTRAINT `posts_ibfk_1` FOREIGN KEY (`PostTypeId`) REFERENCES `posttypes` (`PostTypeId`)
) ENGINE=InnoDB;

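Related discussion

Really? What happens if you have a ton of entries? For example, if you're working with in-house version control, say, and you have a ton of versions per file, that join result would be massive. Have you ever benchmarked the subquery method against this one? I'm pretty curious to know which would win, but not curious enough to not ask you first.
Thanks Bill. That works perfectly. Could you provide more information about the performance of this query against the join provided by Eric?
Did some testing. On a small table (~300k records, ~190k groups, so not massive groups or anything), the queries tied (8 seconds each).
(I should note that it had a composite key and no indexes. It was a throwaway staging table.)
Wow, great information. I was running my tests on SQL Server 2008, so it's interesting to see how MySQL differs with these queries. Goes to show you again, EXPLAIN is your friend!
Aha! I was wondering how you got such different results from mine. In a lot of cases where GROUP BY is used, MySQL creates a temporary table on disk, incurring expensive I/O. It's best to avoid GROUP BY in MySQL if you can. And yes, always analyze queries with EXPLAIN when performance is important.
@Bill: SQL Server hates ORs and MySQL hates GROUP BYs. One day we'll get an RDBMS that likes all SQL. Regarding the EXPLAIN, it will also return a smaller row set if you put the WHERE clause in the subquery.
@Eric: Re putting a WHERE restriction in the subquery, yes, but then you don't need the GROUP BY either.
@Bill: Ah, I always forget that MySQL lets you drop the GROUP BY. Of course, removing it would be the most efficient way to run that query for a specific user. SQL Server is less lenient in its GROUP BYs: if it's in the SELECT, it has to be in the GROUP BY. Of course, it can be done with an OVER clause, which is really just magic.
As @newt pointed out, this query was very slow for me (10+ minutes on SQL Server 2008) with a large data set. I needed to select the last data for each group from a 3.5 million row table.
@jyelton, for SQL Server 2008 you should use a CTE with windowing functions.
@BillKarwin: See meta.stackexchange.com/questions/123017, especially the comments below Adam Rackis' answer. Let me know if you want to reclaim your answer on the new question.
@RobertHarvey, thanks, I'll follow up on the meta post you linked to.
SELECT m1.* FROM messages m1 LEFT JOIN messages m2 ON (m1.name = m2.name AND m1.id
@webenformasyon, the way you have written that condition, the query will fail if m1.anotherid is zero. You have no comparison term; you are only treating anotherID as a boolean.
I just got a downvote. Downvoter, can you explain why you object to this answer? Perhaps I can improve it.
Just wanted to mention that this solution also works on a Derby database.
@BillKarwin this doesn't work for non-unique IDs since it uses a < comparison. Is it possible to somehow use it with <= so that it works when you have duplicate IDs?
@Tim, no, <= will not help if you have a non-unique column. You must use a unique column as a tiebreaker.
Performance degrades exponentially as the number of rows increases or when groups become larger. For example, a group consisting of 5 dates will yield 4+3+2+1+1 = 11 rows via the left join, of which only one row survives the final filter. Performance of joining with grouped results is almost linear. Your tests look flawed.
Nevertheless, I ran these tests and got the results I showed. If you would like to do your own tests and post your own answer showing the results, please be my guest.
@BillKarwin If I have a non-unique column, what can be done to eliminate the duplicates?
@Ahmed, see "Solution 2" in @newt's answer.
@BillKarwin my workaround was to add an extra OR condition to the join and WHERE clause in your suggested query: SELECT m1.* FROM messages m1 LEFT JOIN messages m2 ON (m1.name = m2.name AND (m1.id
@BillKarwin Where did you get a copy of the StackOverflow database?
@wakantanka see meta.stackexchange.com/questions/224873/…
@BillKarwin Thanks for the query. I am very new to SQL and joins, and I would like to know how to modify the same query to do something like this: 1) get the first record instead of the last record; 2) get records only for a specific date (my table has a date field). Thanks again.
@Menon, given what I have shown above, you should be able to do that yourself.
This is a godsend. Your query did the job, and very fast. I needed to grab the latest login time for each user, and it worked. Thanks!
If the EXPLAIN takes almost as long as the query itself, does that mean the query would run faster as a prepared query, since it seems most of the time is spent deciding how to retrieve the data rather than retrieving it?
@Cruncher, not necessarily. The EXPLAIN takes a long time because the optimizer actually executes the subquery for the derived table and creates a temporary table for it before it can estimate the optimization plan.
After 5 days of googling, your answer is the accurate one. Everyone else is fooling around.
@BillKarwin (part 1 of 3) Testing on 8.0.4-rc with my test data (about 1.5 million rows, 100 groups) shows that your queries can be very inefficient:
@BillKarwin (part 2 of 3) Firstly, the messages query using the LEFT JOIN method took too long, so I had to cancel it. The partition query took over 2 seconds. My LIMIT 1 query took only about a second (although that is still inefficient, I have not seen a more efficient way): SELECT m1.id, m1.name, m1.other_columns FROM (SELECT DISTINCT m2.name FROM messages m2) g INNER JOIN messages m1 ON m1.id = (SELECT m3.id FROM messages m3 WHERE m3.name = g.name ORDER BY m3.id DESC LIMIT 1);
@BillKarwin (part 3 of 3) Likewise, your Posts query took over 20 seconds with my test data. However, my LIMIT 1 query took 0 seconds: SELECT p.postid FROM posts p WHERE p.owneruserid = 20860 ORDER BY p.postid DESC LIMIT 1; I wonder how long my LIMIT 1 query would take with your test data?
@Yoseph, you should not be surprised that I no longer have the test data I used for this answer in 2009. Anyway, good for you for doing the testing. Depending on the data set and the version of MySQL, different solutions may be better. Only by testing can we be sure which one is best for any particular case.
@BillKarwin, even your latest update doesn't work on MariaDB 10.0.34.
@Hooman, I don't use or support MariaDB. But according to their documentation on the WITH syntax, they didn't support it until MariaDB 10.2.1.
@BillKarwin Any chance you could test your MySQL 8 answer (MySQL 8 was only released recently)? It also proved inefficient with my test data, see stackoverflow.com/questions/1313120/… (if you test that answer, I would be very interested to know how my LIMIT 1 approach does with your test data)
@Yoseph, you already have my answer on this issue. Different versions of MySQL behave differently. Differences in data also cause different behavior. You tested and found a query solution that performs well for your data set. Call it a win, and move on.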
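
The 4+3+2+1+1 = 11 arithmetic from the comments can be reproduced by counting the rows that the LEFT JOIN yields before the WHERE m2.id IS NULL filter is applied. The sketch below assumes SQLite via Python's sqlite3 module as a stand-in for MySQL, and join_rows_before_filter is a hypothetical helper name. It counts logical join rows only; a real optimizer may stop early for each m1 row (note the "Not exists" entry in the EXPLAIN output above), so this is not a timing benchmark.

```python
import sqlite3

def join_rows_before_filter(group_size):
    """Count LEFT JOIN rows for a single group holding group_size rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO messages (name) VALUES (?)", [("A",)] * group_size
    )
    (count,) = conn.execute("""
        SELECT COUNT(*)
        FROM messages m1 LEFT JOIN messages m2
          ON (m1.name = m2.name AND m1.id < m2.id)
    """).fetchone()
    conn.close()
    return count

# Every pair with m1.id < m2.id matches, plus one NULL-extended row for the
# row with the highest id: n * (n - 1) / 2 + 1 rows in total.
for n in (5, 50, 500):
    print(n, join_rows_before_filter(n), n * (n - 1) // 2 + 1)
```

Under these assumptions the logical row count grows quadratically with the size of a group, which is why the size of the groups matters more than the size of the table.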

UPD: 2017-03-31, version 5.7.5 of MySQL made the ONLY_FULL_GROUP_BY switch enabled by default (hence, non-deterministic GROUP BY queries became disabled). Moreover, they updated the GROUP BY implementation and the solution might not work as expected anymore even with the switch disabled. One needs to check.

Bill Karwin's solution above works fine when the item count within groups is rather small, but the performance of the query becomes bad when the groups are rather large, since the solution requires about […] of only […] comparisons.

I made my tests on an InnoDB table of […] rows with […] groups. The table contains test results for functional tests and has […] as the primary key. Thus, […] is a group and I was searching for the last […].

Bill's solution has already been running for several hours on my Dell E4310 and I do not know when it is going to finish, even though it operates on a covering index (hence […]).

I have a couple of other solutions that are based on the same ideas:

If the underlying index is a BTREE index (which is usually the case), the largest […] pair is the last value within each […], that is, the first for each […] if we walk through the index in descending order;
if we read the values which are covered by an index, the values are read in the order of the index;
each index implicitly contains primary key columns appended to it (that is, the primary key is in the covering index). In the solutions below I operate directly on the primary key; in your case, you will just need to add the primary key columns to the result.
In many cases it is much cheaper to collect the required row ids in the required order in a subquery and join the result of the subquery on the id. Since for each row in the subquery result MySQL will need a single fetch based on the primary key, the subquery will be put first in the join and the rows will be output in the order of the ids in the subquery (if we omit an explicit ORDER BY for the join).

"3 ways MySQL uses indexes" is a great article to understand some details.

Solution 1

This one is incredibly fast, it takes about 0.8 secs on my 18M+ rows:

[…]

If you want to change the order to ASC, put it in a subquery, return the ids only and use that as the subquery to join to the rest of the columns:

[…]

This one takes about 1.2 secs on my data.

Solution 2

This is another solution that takes about 19 seconds for my table:

[…]

It returns tests in descending order as well. It is much slower since it does a full index scan, but it is here to give you an idea of how to output the N max rows for each group.

The disadvantage of the query is that its result cannot be cached by the query cache.

Related discussion

Use your subquery to return the correct grouping, because you're halfway there.

Try this:

SELECT
a.*
FROM
messages a
INNER JOIN
(SELECT name, MAX(id) AS maxid FROM messages GROUP BY name) AS b ON
a.id = b.maxid

If it's not id that you want the max of:

SELECT
a.*
FROM
messages a
INNER JOIN
(SELECT name, MAX(other_col) AS other_col
FROM messages GROUP BY name) AS b ON
a.name = b.name
AND a.other_col = b.other_col

This way, you avoid correlated subqueries and/or ordering in your subqueries, which tend to be very slow and inefficient.
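
One thing worth checking with the second query is ties: if several rows in a group share the maximum other_col, the join returns all of them. A minimal sketch, assuming SQLite via Python's sqlite3 module as a stand-in for MySQL and made-up sample values:

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the values below are invented
# so that group A has two rows tied on the maximum other_col.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT, other_col INTEGER)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [(1, "A", 10), (2, "A", 20), (3, "A", 20), (4, "B", 5)],
)

# Join each row to the per-name maximum of other_col.
tied_rows = conn.execute("""
    SELECT a.id, a.name, a.other_col
    FROM messages a
    INNER JOIN (SELECT name, MAX(other_col) AS other_col
                FROM messages GROUP BY name) AS b
      ON a.name = b.name AND a.other_col = b.other_col
    ORDER BY a.id
""").fetchall()
print(tied_rows)
```

Group A comes back twice here (ids 2 and 3), so this variant gives exactly one row per group only when the chosen column is unique within each group.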

Related discussion

I arrived at a different solution, which is to get the IDs for the last post within each group, then select from the messages table using the result from the first query as the argument for a […] construct:

[…]

I don't know how this performs compared to some of the other solutions, but it worked spectacularly for my table with 3+ million rows. (4 second execution with 1200+ results)

This should work both on MySQL and SQL Server.

Related discussion

Solution by subquery (fiddle link)

[…]

Solution by join condition (fiddle link)

[…]

The reason for this post is to give the fiddle links only. The same SQL is already provided in other answers.

Related discussion

I've not yet tested with a large DB, but I think this could be faster than joining tables:

[…]

Related discussion

Here is my solution:

SELECT
DISTINCT NAME,
MAX(MESSAGES) OVER(PARTITION BY NAME) MESSAGES
FROM MESSAGE;

Here are two suggestions. First, if MySQL supports ROW_NUMBER(), it's very simple:

WITH Ranked AS (
SELECT Id, Name, OtherColumns,
ROW_NUMBER() OVER (
PARTITION BY Name
ORDER BY Id DESC
) AS rk
FROM messages
)
SELECT Id, Name, OtherColumns
FROM Ranked
WHERE rk = 1;

I'm assuming that by "last" you mean last in Id order. If not, change the ORDER BY clause of the ROW_NUMBER() window accordingly.

Second, if ROW_NUMBER() isn't available, this is often a good way to proceed:

SELECT
Id, Name, OtherColumns
FROM messages
WHERE NOT EXISTS (
SELECT * FROM messages AS M2
WHERE M2.Name = messages.Name
AND M2.Id > messages.Id
)

In other words, select the messages for which there is no message with the same Name and a later Id.
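
A minimal sketch of the NOT EXISTS form on the question's sample data, assuming SQLite via Python's sqlite3 module as a stand-in for MySQL. The column is named Other_Columns after the question's table header rather than the OtherColumns placeholder used above.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (Id INTEGER PRIMARY KEY, Name TEXT, Other_Columns TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [(1, "A", "A_data_1"), (2, "A", "A_data_2"), (3, "A", "A_data_3"),
     (4, "B", "B_data_1"), (5, "B", "B_data_2"), (6, "C", "C_data_1")],
)

# Keep a row only if no row with the same Name has a higher Id.
no_later_row = conn.execute("""
    SELECT Id, Name, Other_Columns
    FROM messages
    WHERE NOT EXISTS (
        SELECT * FROM messages AS M2
        WHERE M2.Name = messages.Name
          AND M2.Id > messages.Id
    )
    ORDER BY Id
""").fetchall()
print(no_later_row)
```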

Related discussion

A safe and fast approach is as follows.

SELECT *
FROM messages a
WHERE Id = (SELECT MAX(Id) FROM messages WHERE a.Name = Name)

Result

Id Name Other_Columns
3 A A_data_3
5 B B_data_2
6 C C_data_1

Here is another way to get the last related record, using […] with ORDER BY and […] to pick one of the records from the list:

[…]

The above query will group all the […] that are in the same […] group, and using […] will join all the […] in a specific group in descending order with the provided separator (in my case I have used […]); using […] over this list will pick the first one.

Fiddle Demo

SELECT
column1,
column2
FROM
TABLE_NAME
WHERE id IN
(SELECT
MAX(id)
FROM
TABLE_NAME
GROUP BY column1)
ORDER BY column1 ;

Related discussion

Try this:

SELECT jos_categories.title AS name,
joined.catid,
joined.title,
joined.introtext
FROM jos_categories
INNER JOIN (SELECT *
FROM (SELECT `title`,
catid,
`created`,
introtext
FROM `jos_content`
WHERE `sectionid` = 6
ORDER BY `id` DESC) AS yes
GROUP BY `yes`.`catid` DESC
ORDER BY `yes`.`created` DESC) AS joined
ON( joined.catid = jos_categories.id )

You can view it here as well.

http://sqlfiddle.com/#!9/9/ef42b

First solution

SELECT d1.ID,Name,City FROM Demo_User d1
INNER JOIN
(SELECT MAX(ID) AS ID FROM Demo_User GROUP BY NAME) AS P ON (d1.ID=P.ID);

Second solution

SELECT * FROM (SELECT * FROM Demo_User ORDER BY ID DESC) AS T GROUP BY NAME ;

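Related discussion

Clearly there are lots of different ways of getting the same results; your question seems to be what is an efficient way of getting the last result in each group in MySQL. If you are working with huge amounts of data, and assuming you are using InnoDB with even the latest versions of MySQL (such as 5.7.21 and 8.0.4-rc), then there might not be an efficient way of doing this.

We sometimes need to do this with tables with even more than 60 million rows.

For these examples I will use data with only about 1.5 million rows, where the queries need to find results for all groups in the data. In our actual cases we would often need to return data from about 2,000 groups (which hypothetically would not require examining very much of the data).

I will use the following tables: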

CREATE TABLE temperature(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
groupID INT UNSIGNED NOT NULL,
recordedTimestamp TIMESTAMP NOT NULL,
recordedValue INT NOT NULL,
INDEX groupIndex(groupID, recordedTimestamp),
PRIMARY KEY (id)
);

CREATE TEMPORARY TABLE selected_group(id INT UNSIGNED NOT NULL, PRIMARY KEY(id));

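The temperature table is populated with about 1.5 million random records and 100 different groups. The selected_group table is populated with those 100 groups (in our cases this would normally be less than 20% of all groups).

As this data is random, multiple rows can have the same recordedTimestamp. What we want is a list of all the selected groups in groupID order, with the last recordedTimestamp for each group, and, if the same group has more than one matching row like that, the last matching id of those rows.

If, hypothetically, MySQL had a last() function that returned values from the last row in a special ORDER BY clause, then we could simply do: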

SELECT
LAST(t1.id) AS id,
t1.groupID,
LAST(t1.recordedTimestamp) AS recordedTimestamp,
LAST(t1.recordedValue) AS recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.groupID = g.id
ORDER BY t1.recordedTimestamp, t1.id
GROUP BY t1.groupID;

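This would only need to examine a few hundred rows in this case, as it does not use any of the normal GROUP BY functions. It would execute in 0 seconds and hence be highly efficient. Note that normally in MySQL the ORDER BY clause follows the GROUP BY clause; here, however, the ORDER BY clause is used to determine the order for the last() function. If it came after the GROUP BY, it would be ordering the groups. If no GROUP BY clause were present, the last values would be the same in all of the returned rows.

However, MySQL does not have this, so let's look at different ideas based on what it does have and check how efficient each one is.

Example 1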

SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.id = (
SELECT t2.id
FROM temperature t2
WHERE t2.groupID = g.id
ORDER BY t2.recordedTimestamp DESC, t2.id DESC
LIMIT 1
);

This examined 3,009,254 rows and took ~0.859 seconds on 5.7.21 and slightly longer on 8.0.4-rc.
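
To see how the ORDER BY ... LIMIT 1 subquery resolves ties on recordedTimestamp, here is a small sketch. It assumes SQLite via Python's sqlite3 module as a stand-in for MySQL, integer timestamps, and a handful of made-up rows; it shows the logic only and says nothing about performance.

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL, timestamps are plain integers,
# and the rows are invented. Group 1 has two rows tied on the latest
# timestamp; group 3 is deliberately not selected.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE temperature (
        id INTEGER PRIMARY KEY,
        groupID INTEGER NOT NULL,
        recordedTimestamp INTEGER NOT NULL,
        recordedValue INTEGER NOT NULL
    )
""")
conn.execute("CREATE TABLE selected_group (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO temperature VALUES (?, ?, ?, ?)",
    [(1, 1, 100, 20), (2, 1, 200, 21), (3, 1, 200, 22),
     (4, 2, 50, 18), (5, 3, 10, 30)],
)
conn.executemany("INSERT INTO selected_group VALUES (?)", [(1,), (2,)])

# For each selected group, pick the row with the latest timestamp and,
# among rows sharing that timestamp, the highest id.
last_per_group = conn.execute("""
    SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
    FROM selected_group g
    INNER JOIN temperature t1 ON t1.id = (
        SELECT t2.id
        FROM temperature t2
        WHERE t2.groupID = g.id
        ORDER BY t2.recordedTimestamp DESC, t2.id DESC
        LIMIT 1
    )
    ORDER BY t1.groupID
""").fetchall()
print(last_per_group)
```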

Example 2

SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM temperature t1
INNER JOIN (
SELECT MAX(t2.id) AS id
FROM temperature t2
INNER JOIN (
SELECT t3.groupID, MAX(t3.recordedTimestamp) AS recordedTimestamp
FROM selected_group g
INNER JOIN temperature t3 ON t3.groupID = g.id
GROUP BY t3.groupID
) t4 ON t4.groupID = t2.groupID AND t4.recordedTimestamp = t2.recordedTimestamp
GROUP BY t2.groupID
) t5 ON t5.id = t1.id;

This examined 1,505,331 rows and took ~1.25 seconds on 5.7.21 and slightly longer on 8.0.4-rc.

Example 3

SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM temperature t1
WHERE t1.id IN (
SELECT MAX(t2.id) AS id
FROM temperature t2
INNER JOIN (
SELECT t3.groupID, MAX(t3.recordedTimestamp) AS recordedTimestamp
FROM selected_group g
INNER JOIN temperature t3 ON t3.groupID = g.id
GROUP BY t3.groupID
) t4 ON t4.groupID = t2.groupID AND t4.recordedTimestamp = t2.recordedTimestamp
GROUP BY t2.groupID
)
ORDER BY t1.groupID;

This examined 3,009,685 rows and took ~1.95 seconds on 5.7.21 and slightly longer on 8.0.4-rc.

Example 4

SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.id = (
SELECT MAX(t2.id)
FROM temperature t2
WHERE t2.groupID = g.id AND t2.recordedTimestamp = (
SELECT MAX(t3.recordedTimestamp)
FROM temperature t3
WHERE t3.groupID = g.id
)
);

This examined 6,137,810 rows and took ~2.2 seconds on 5.7.21 and slightly longer on 8.0.4-rc.

Example 5

SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM (
SELECT
t2.id,
t2.groupID,
t2.recordedTimestamp,
t2.recordedValue,
ROW_NUMBER() OVER (
PARTITION BY t2.groupID ORDER BY t2.recordedTimestamp DESC, t2.id DESC
) AS rowNumber
FROM selected_group g
INNER JOIN temperature t2 ON t2.groupID = g.id
) t1 WHERE t1.rowNumber = 1;

This examined 6,017,808 rows and took ~4.2 seconds on 8.0.4-rc.

Example 6

SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM (
SELECT
last_value(t2.id) OVER w AS id,
t2.groupID,
last_value(t2.recordedTimestamp) OVER w AS recordedTimestamp,
last_value(t2.recordedValue) OVER w AS recordedValue
FROM selected_group g
INNER JOIN temperature t2 ON t2.groupID = g.id
WINDOW w AS (
PARTITION BY t2.groupID
ORDER BY t2.recordedTimestamp, t2.id
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
) t1
GROUP BY t1.groupID;

This examined 6,017,908 rows and took ~17.5 seconds on 8.0.4-rc.

Example 7

SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.groupID = g.id
LEFT JOIN temperature t2
ON t2.groupID = g.id
AND (
t2.recordedTimestamp > t1.recordedTimestamp
OR (t2.recordedTimestamp = t1.recordedTimestamp AND t2.id > t1.id)
)
WHERE t2.id IS NULL
ORDER BY t1.groupID;

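This one was taking forever, so I had to kill it.

Hi @Vijay Dev, if your table messages contains Id, which is an auto-increment primary key, then to fetch the latest record on the basis of the primary key your query should read as below:

SELECT m1.* FROM messages m1 INNER JOIN (SELECT MAX(Id) AS lastmsgId FROM messages GROUP BY Name) m2 ON m1.Id=m2.lastmsgId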

Related discussion

If you want the last row for each Name, then you can give each row a row number, grouped by Name and ordered by Id in descending order.

Query

SELECT t1.Id,
t1.Name,
t1.Other_Columns
FROM
(
SELECT Id,
Name,
Other_Columns,
(
CASE Name WHEN @curA
THEN @curRow := @curRow + 1
ELSE @curRow := 1 AND @curA := Name END
) + 1 AS rn
FROM messages t,
(SELECT @curRow := 0, @curA := '') r
ORDER BY Name,Id DESC
)t1
WHERE t1.rn = 1
ORDER BY t1.Id;

SQL fiddle

According to your question, the query below will work fine.

SELECT M1.*
FROM MESSAGES M1,
(
SELECT SUBSTR(Others_data,1,2),MAX(Others_data) AS Max_Others_data
FROM MESSAGES
GROUP BY 1
) M2
WHERE M1.Others_data = M2.Max_Others_data
ORDER BY Others_data;

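Can we use this method to delete duplicates in a table? The result set is basically a collection of unique records, so if we could delete all records that are not in the result set, we would effectively have no duplicates. I tried this, but MySQL gave a 1093 error.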

DELETE FROM messages WHERE id NOT IN
(SELECT m1.id
FROM messages m1 LEFT JOIN messages m2
ON (m1.name = m2.name AND m1.id < m2.id)
WHERE m2.id IS NULL)

Is there a way to save the output to a temporary variable and then delete with NOT IN (temporary variable)? @Bill, thanks for a very useful solution.

EDIT: I think I found the solution:

DROP TABLE IF EXISTS UniqueIDs;
CREATE TEMPORARY TABLE UniqueIDs (id INT(11));

INSERT INTO UniqueIDs
(SELECT T1.ID FROM TABLE T1 LEFT JOIN TABLE T2 ON
(T1.Field1 = T2.Field1 AND T1.Field2 = T2.Field2 #Comparison FIELDS
AND T1.ID < T2.ID)
WHERE T2.ID IS NULL);

DELETE FROM TABLE WHERE id NOT IN (SELECT ID FROM UniqueIDs);
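
The temporary-table flow above can be tried on the question's messages table. This is a sketch under stated assumptions: SQLite via Python's sqlite3 module stands in for MySQL, so it does not reproduce error 1093 (SQLite accepts the direct DELETE as well); it only shows the collect-then-delete steps.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows with the same name are
# treated as duplicates and only the highest id per name is kept.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT, other_columns TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [(1, "A", "A_data_1"), (2, "A", "A_data_2"), (3, "A", "A_data_3"),
     (4, "B", "B_data_1"), (5, "B", "B_data_2"), (6, "C", "C_data_1")],
)

conn.executescript("""
    -- Step 1: collect the ids to keep into a temporary table.
    CREATE TEMPORARY TABLE UniqueIDs (id INTEGER);
    INSERT INTO UniqueIDs
        SELECT m1.id
        FROM messages m1 LEFT JOIN messages m2
          ON (m1.name = m2.name AND m1.id < m2.id)
        WHERE m2.id IS NULL;
    -- Step 2: delete everything that was not collected.
    DELETE FROM messages WHERE id NOT IN (SELECT id FROM UniqueIDs);
""")

remaining = conn.execute("SELECT id, name FROM messages ORDER BY id").fetchall()
print(remaining)
```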

SELECT * FROM messages GROUP BY name DESC

Related discussion

How about this:

SELECT DISTINCT ON (name) *
FROM messages
ORDER BY name, id DESC;

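I had a similar issue (on PostgreSQL though) on a large table. This solution takes 1.7s vs 44s for the one with the LEFT JOIN. In my case I also had to filter the field corresponding to your name field against NULL values, which improved performance further, by 0.2 secs.

If performance is really your concern, you can introduce a new column on the table called IsLastInGroup of type BIT.

Set it to true on the rows which are last and maintain it with every row insert/update/delete. Writes will be slower, but you'll benefit on reads. It depends on your use case, and I recommend it only if your workload is read-focused.

So your query will look like:

SELECT * FROM Messages WHERE IsLastInGroup = 1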
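
One way the flag could be maintained on insert is with a trigger. The sketch below is an assumption-heavy illustration, not the answer's own implementation: it uses SQLite via Python's sqlite3 module as a stand-in for MySQL, the trigger name messages_mark_last is hypothetical, it covers inserts only (updates and deletes would need their own handling), and it assumes every new row gets the highest id in its group.

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL; new rows always carry the
# highest id of their group; only INSERT is handled here.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY,
        name TEXT,
        other_columns TEXT,
        IsLastInGroup INTEGER NOT NULL DEFAULT 0
    )
""")
# Hypothetical trigger: clear the flag on the group's older rows, then
# set it on the row that was just inserted.
conn.execute("""
    CREATE TRIGGER messages_mark_last AFTER INSERT ON messages
    BEGIN
        UPDATE messages SET IsLastInGroup = 0
        WHERE name = NEW.name AND id <> NEW.id;
        UPDATE messages SET IsLastInGroup = 1 WHERE id = NEW.id;
    END
""")
conn.executemany(
    "INSERT INTO messages (id, name, other_columns) VALUES (?, ?, ?)",
    [(1, "A", "A_data_1"), (2, "A", "A_data_2"), (3, "A", "A_data_3"),
     (4, "B", "B_data_1"), (5, "B", "B_data_2"), (6, "C", "C_data_1")],
)

flagged = conn.execute(
    "SELECT id, name FROM messages WHERE IsLastInGroup = 1 ORDER BY id"
).fetchall()
print(flagged)
```

The read side then becomes a plain filter on the flag, at the cost of two extra updates per insert.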

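SELECT * FROM table_name WHERE primary_key IN (SELECT MAX(primary_key) FROM table_name GROUP BY column_name)

Related discussion

We will look at how you can use MySQL to get the last record in each group of records. For example, suppose you have this result set of posts.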

id category_id post_title
1 1 Title 1
2 1 Title 2
3 1 Title 3
4 2 Title 4
5 2 Title 5
6 3 Title 6

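I want to be able to get the last post in each category, which are Title 3, Title 5 and Title 6. To get the posts by category you would use the MySQL GROUP BY keyword.

select * from posts group by category_id

But the results we get back from this query are:

id category_id post_title
1 1 Title 1
4 2 Title 4
6 3 Title 6

GROUP BY returned the first record of each group here, although MySQL does not guarantee which row of a group it picks.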

SELECT id, category_id, post_title
FROM posts
WHERE id IN (
SELECT MAX(id)
FROM posts
GROUP BY category_id
);

This will return the post with the highest ID in each group.

id category_id post_title
3 1 Title 3
5 2 Title 5
6 3 Title 6
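
A minimal sketch that runs the MAX(id) query against the posts listed above, assuming SQLite via Python's sqlite3 module as a stand-in for MySQL:

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows are the posts above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, category_id INTEGER, post_title TEXT)"
)
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [(1, 1, "Title 1"), (2, 1, "Title 2"), (3, 1, "Title 3"),
     (4, 2, "Title 4"), (5, 2, "Title 5"), (6, 3, "Title 6")],
)

# The subquery yields one MAX(id) per category; the outer query fetches
# the full rows for exactly those ids.
last_posts = conn.execute("""
    SELECT id, category_id, post_title
    FROM posts
    WHERE id IN (
        SELECT MAX(id)
        FROM posts
        GROUP BY category_id
    )
    ORDER BY id
""").fetchall()
print(last_posts)
```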


You can group and count, and also get the last item of each group, like:

SELECT
USER,
COUNT(USER) AS COUNT,
MAX(id) AS LAST
FROM request
GROUP BY USER
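
A small sketch of the count-plus-last-id aggregate, assuming SQLite via Python's sqlite3 module as a stand-in for MySQL. The sample rows and the aliases request_count and last_id are made up for illustration, and "user" is quoted because it is a reserved word in some SQL dialects.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; users and ids are invented.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE request (id INTEGER PRIMARY KEY, "user" TEXT)')
conn.executemany(
    "INSERT INTO request VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "alice"), (4, "alice")],
)

# One row per user: how many requests they made and the id of the latest.
summary = conn.execute("""
    SELECT "user", COUNT("user") AS request_count, MAX(id) AS last_id
    FROM request
    GROUP BY "user"
    ORDER BY "user"
""").fetchall()
print(summary)
```

This returns only the id of the last row per group; fetching the rest of that row still needs a join or an IN subquery on last_id.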

Have you seen https://github.com/fhulufhelo/get-last-record-in-each-mysql-group? It works for me.

$sql ="SELECT c.id, c.name, c.email, r.id, r.companyid, r.name, r.email FROM companytable c LEFT JOIN ( SELECT * FROM revisiontable WHERE id IN ( SELECT MAX(id) FROM revisiontable GROUP BY companyid )) r ON c.id = r.companyid";