How to do a FULL OUTER JOIN in MySQL?

I want to do a full outer join in MySQL. Is this possible? Does MySQL support FULL OUTER JOIN?


MySQL has no FULL JOIN, but you can emulate one.

For a code sample transcribed from this question, you have:

With two tables t1, t2:

SELECT * FROM t1
LEFT JOIN t2 ON t1.id = t2.id
UNION
SELECT * FROM t1
RIGHT JOIN t2 ON t1.id = t2.id

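The query above works for the special case where a FULL OUTER JOIN operation would not produce any duplicate rows. It depends on the UNION set operator to remove the duplicate rows introduced by the query pattern. We can avoid introducing duplicate rows by using an anti-join pattern for the second query, and then use a UNION ALL set operator to combine the two sets. In the more general case, where a FULL OUTER JOIN would return duplicate rows, we can do this:

SELECT * FROM t1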
LEFT JOIN t2 ON t1.id = t2.id
UNION ALL
SELECT * FROM t1
RIGHT JOIN t2 ON t1.id = t2.id
WHERE t1.id IS NULL


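The answer that Pablo Santa Cruz gave is correct; however, in case anybody stumbles on this page and wants more clarification, here is a detailed breakdown.

Example Tables

Suppose we have the following tables:

-- t1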
id  name
1   Tim
2   Marta

-- t2
id  name
1   Tim
3   Katarina

Inner Joins

An inner join, like this:

SELECT *
FROM `t1`
INNER JOIN `t2` ON `t1`.`id` = `t2`.`id`;

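...would get us only the records that appear in both tables, like this:

1 Tim  1 Tim

Inner joins don't have a direction (like left or right) because they are explicitly bidirectional: we require a match on both sides.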

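Outer Joins

Outer joins, on the other hand, are for finding records that may not have a match in the other table. As such, you have to specify which side of the join is allowed to have a missing record.

LEFT JOIN and RIGHT JOIN are shorthand for LEFT OUTER JOIN and RIGHT OUTER JOIN; I will use their full names below to reinforce the concept of outer joins versus inner joins.

Left Outer Join

A left outer join, like this:

SELECT *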
FROM `t1`
LEFT OUTER JOIN `t2` ON `t1`.`id` = `t2`.`id`;

...would get us all the records from the left table regardless of whether or not they have a match in the right table, like this:

1 Tim   1    Tim
2 Marta NULL NULL

Right Outer Join

A right outer join, like this:

SELECT *
FROM `t1`
RIGHT OUTER JOIN `t2` ON `t1`.`id` = `t2`.`id`;

...would get us all the records from the right table regardless of whether or not they have a match in the left table, like this:

1    Tim   1  Tim
NULL NULL  3  Katarina

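Full Outer Join

A full outer join would give us all records from both tables, whether or not they have a match in the other table, with NULLs on whichever side has no match. The result would look like this:

1    Tim   1    Tim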
2    Marta NULL NULL
NULL NULL  3    Katarina

However, as Pablo Santa Cruz pointed out, MySQL doesn't support this. We can emulate it by doing a UNION of a left join and a right join, like this:

SELECT *
FROM `t1`
LEFT OUTER JOIN `t2` ON `t1`.`id` = `t2`.`id`

UNION

SELECT *
FROM `t1`
RIGHT OUTER JOIN `t2` ON `t1`.`id` = `t2`.`id`;

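You can think of a UNION as meaning "run both of these queries, then stack the results on top of each other"; some of the rows will come from the first query and some from the second.

It should be noted that a UNION in MySQL will eliminate exact duplicates: Tim would appear in both of the queries here, but the result of the UNION only lists him once. My database guru colleague feels that this behavior should not be relied upon. So, to be more explicit about it, we could add a WHERE clause to the second query:

SELECT *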
FROM `t1`
LEFT OUTER JOIN `t2` ON `t1`.`id` = `t2`.`id`

UNION

SELECT *
FROM `t1`
RIGHT OUTER JOIN `t2` ON `t1`.`id` = `t2`.`id`
WHERE `t1`.`id` IS NULL;

On the other hand, if you wanted to see duplicates for some reason, you could use UNION ALL.
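For anyone who wants to poke at this without a MySQL server, here is a minimal sketch in Python. It uses the standard sqlite3 module as a stand-in for MySQL, which is my assumption and not part of the answer above. The right-join branch is written as a LEFT JOIN with the tables swapped, because SQLite versions before 3.39 have no RIGHT JOIN. The variable names are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, name TEXT);
    CREATE TABLE t2 (id INTEGER, name TEXT);
    INSERT INTO t1 VALUES (1, 'Tim'), (2, 'Marta');
    INSERT INTO t2 VALUES (1, 'Tim'), (3, 'Katarina');
""")

# All rows of t1, matched or not.
LEFT_BRANCH = (
    "SELECT t1.id, t1.name, t2.id, t2.name "
    "FROM t1 LEFT JOIN t2 ON t1.id = t2.id"
)
# All rows of t2, matched or not (a right join written as a swapped left join).
RIGHT_BRANCH = (
    "SELECT t1.id, t1.name, t2.id, t2.name "
    "FROM t2 LEFT JOIN t1 ON t1.id = t2.id"
)

# UNION removes the duplicate Tim row.
union_rows = conn.execute(f"{LEFT_BRANCH} UNION {RIGHT_BRANCH}").fetchall()
# UNION ALL keeps it, so Tim shows up twice.
union_all_rows = conn.execute(f"{LEFT_BRANCH} UNION ALL {RIGHT_BRANCH}").fetchall()
# UNION ALL plus the IS NULL filter on the second branch avoids the duplicate explicitly.
filtered_rows = conn.execute(
    f"{LEFT_BRANCH} UNION ALL {RIGHT_BRANCH} WHERE t1.id IS NULL"
).fetchall()

for label, rows in [("UNION", union_rows),
                    ("UNION ALL", union_all_rows),
                    ("UNION ALL + filter", filtered_rows)]:
    print(label, len(rows), rows)
```

The filtered variant returns the same three rows as plain UNION here, but it does not rely on duplicate elimination to get there.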


Using a UNION query will remove duplicates, and this is different from the behavior of FULL OUTER JOIN, which never removes any duplicates:

[TABLE: t1]                            [TABLE: t2]
VALUE                                  VALUE
-------                                -------
1                                      1
2                                      2
4                                      2
4                                      5

This is the expected result of FULL OUTER JOIN:

VALUE | VALUE
------+-------
1     | 1
2     | 2
2     | 2
NULL  | 5
4     | NULL
4     | NULL

This is the result of using LEFT and RIGHT JOIN with UNION:

VALUE | VALUE
------+-------
NULL  | 5
1     | 1
2     | 2
4     | NULL

[SQL Fiddle]

My suggested query is:

SELECT
    t1.value, t2.value
FROM t1
LEFT OUTER JOIN t2  
  ON t1.value = t2.value
UNION ALL      -- Using `union all` instead of `union`
SELECT
    t1.value, t2.value
FROM t2
LEFT OUTER JOIN t1
  ON t1.value = t2.value
WHERE
    t1.value IS NULL

The result of the above query is the same as the expected result:

VALUE | VALUE
------+-------
1     | 1
2     | 2
2     | 2
4     | NULL
4     | NULL
NULL  | 5

[SQL Fiddle]

@Steve Chambers: [From comments, with many thanks!]
Note: This may be the best solution, both for efficiency and for generating the same results as a FULL OUTER JOIN. This blog post also explains it well - to quote from Method 2:"This handles duplicate rows correctly and doesn’t include anything it shouldn’t. It’s necessary to use UNION ALL instead of plain UNION, which would eliminate the duplicates I want to keep. This may be significantly more efficient on large result sets, since there’s no need to sort and remove duplicates."
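To make the difference concrete, here is a small sketch that loads the duplicate-bearing tables from above and runs both variants. It is written in Python with the standard sqlite3 module standing in for MySQL (my assumption; the right-join branch is expressed as a swapped LEFT JOIN so it also runs on SQLite builds without RIGHT JOIN). The variable names are illustrative only.

```python
import sqlite3
from collections import Counter

# In-memory SQLite database standing in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (value INTEGER);
    CREATE TABLE t2 (value INTEGER);
    INSERT INTO t1 VALUES (1), (2), (4), (4);
    INSERT INTO t2 VALUES (1), (2), (2), (5);
""")

# Plain UNION of both outer joins: duplicates are collapsed.
union_rows = conn.execute("""
    SELECT t1.value, t2.value FROM t1 LEFT JOIN t2 ON t1.value = t2.value
    UNION
    SELECT t1.value, t2.value FROM t2 LEFT JOIN t1 ON t1.value = t2.value
""").fetchall()

# UNION ALL with an anti-join as the second branch: duplicates are kept.
union_all_rows = conn.execute("""
    SELECT t1.value, t2.value FROM t1 LEFT JOIN t2 ON t1.value = t2.value
    UNION ALL
    SELECT t1.value, t2.value FROM t2 LEFT JOIN t1 ON t1.value = t2.value
    WHERE t1.value IS NULL
""").fetchall()

print("UNION:    ", len(union_rows), "rows")
print("UNION ALL:", len(union_all_rows), "rows")
print(Counter(union_all_rows))
```

Counting rows with a Counter sidesteps the fact that neither query guarantees any particular row order.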

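I decided to add another solution that comes from the visualization and math of FULL OUTER JOIN; it is not better than the one above, but it is more readable: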

Full outer join means (t1 ∪ t2): all in t1 or in t2
(t1 ∪ t2) = (t1 ∩ t2) + t1_only + t2_only: all in both t1 and t2 plus all in t1 that aren't in t2 and plus all in t2 that aren't in t1:

-- (t1 ∩ t2): all in both t1 and t2
SELECT t1.value, t2.value
FROM t1 JOIN t2 ON t1.value = t2.value    
UNION ALL  -- And plus
-- all in t1 that not exists in t2
SELECT t1.value, NULL
FROM t1
WHERE NOT EXISTS( SELECT 1 FROM t2 WHERE t2.value = t1.value)    
UNION ALL  -- and plus
-- all in t2 that not exists in t1
SELECT NULL, t2.value
FROM t2
WHERE NOT EXISTS( SELECT 1 FROM t1 WHERE t2.value = t1.value)

[SQL Fiddle]
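As a quick check of the three-part decomposition, the sketch below runs it against the same duplicate-bearing tables and compares the outcome with the expected FULL OUTER JOIN rows listed earlier. It uses Python's standard sqlite3 module as a stand-in for MySQL, which is an assumption of mine; the names three_part_rows and expected are made up for the example.

```python
import sqlite3
from collections import Counter

# In-memory SQLite database standing in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (value INTEGER);
    CREATE TABLE t2 (value INTEGER);
    INSERT INTO t1 VALUES (1), (2), (4), (4);
    INSERT INTO t2 VALUES (1), (2), (2), (5);
""")

three_part_rows = conn.execute("""
    -- rows present in both tables
    SELECT t1.value, t2.value
    FROM t1 JOIN t2 ON t1.value = t2.value
    UNION ALL
    -- rows only in t1
    SELECT t1.value, NULL
    FROM t1
    WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t2.value = t1.value)
    UNION ALL
    -- rows only in t2
    SELECT NULL, t2.value
    FROM t2
    WHERE NOT EXISTS (SELECT 1 FROM t1 WHERE t1.value = t2.value)
""").fetchall()

# The expected FULL OUTER JOIN result, as a multiset of rows.
expected = Counter({(1, 1): 1, (2, 2): 2, (4, None): 2, (None, 5): 1})
print(Counter(three_part_rows) == expected)
```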


MySQL does not have FULL OUTER JOIN syntax. You have to emulate it by doing both a LEFT JOIN and a RIGHT JOIN, as follows -

SELECT * FROM t1
LEFT JOIN t2 ON t1.id = t2.id  
UNION
SELECT * FROM t1
RIGHT JOIN t2 ON t1.id = t2.id

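MySQL does accept RIGHT JOIN syntax, but it does not execute right joins as such. According to MySQL's outer join simplification, a right join is converted at the parser stage to the equivalent left join by switching t1 and t2 in the FROM and ON clauses of the query. Thus, the original query is processed as if it had been written as follows -

SELECT * FROM t1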
LEFT JOIN t2 ON t1.id = t2.id  
UNION
SELECT * FROM t2
LEFT JOIN t1 ON t2.id = t1.id

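Now, there is no harm in writing the original query as is, but if you have predicates such as a WHERE clause, which is logically applied after the join, or an AND predicate in the ON clause, which is applied during the join, then you might want to take a look at the devil, which is in the details.

The MySQL query optimizer routinely checks predicates to see whether they are null-rejected (see Null-Rejected Definition and Examples). Now, if you have written a RIGHT JOIN, but with a WHERE predicate on a column from t1, then you run the risk of hitting a null-rejected scenario.

For example, the following query -

SELECT * FROM t1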
LEFT JOIN t2 ON t1.id = t2.id
WHERE t1.col1 = 'someValue'
UNION
SELECT * FROM t1
RIGHT JOIN t2 ON t1.id = t2.id
WHERE t1.col1 = 'someValue'

gets translated to the following by the query optimizer -

SELECT * FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE t1.col1 = 'someValue'
UNION
SELECT * FROM t2
LEFT JOIN t1 ON t2.id = t1.id
WHERE t1.col1 = 'someValue'

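So the order of the tables has changed, but the predicate is still applied to t1, and t1 is now the table in the ON clause. If t1.col1 is defined as a NOT NULL column, then this query will be null-rejected.

Any outer join (left, right, full) that is null-rejected is converted to an inner join by MySQL.

Thus the results you expect might be completely different from what MySQL returns. You might think it is a bug in MySQL's RIGHT JOIN, but that is not right: it is just how the MySQL query optimizer works. So the developer in charge has to pay attention to these nuances when constructing the query.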
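The effect can be observed on a tiny data set. The sketch below uses Python's standard sqlite3 module as a stand-in for MySQL (my assumption; it shows the result of such a query, not MySQL's internal rewrite). The sample rows and variable names are invented for the illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, col1 TEXT NOT NULL);
    CREATE TABLE t2 (id INTEGER);
    INSERT INTO t1 VALUES (1, 'someValue'), (2, 'other');
    INSERT INTO t2 VALUES (1), (2), (3);
""")

# Outer join that keeps every t2 row, then a WHERE filter on the t1 side.
# The NULL-extended rows can never satisfy the filter, so they vanish.
outer_with_where = conn.execute("""
    SELECT t1.id, t1.col1, t2.id
    FROM t2 LEFT JOIN t1 ON t2.id = t1.id
    WHERE t1.col1 = 'someValue'
""").fetchall()

# The same query written as an inner join.
inner_join = conn.execute("""
    SELECT t1.id, t1.col1, t2.id
    FROM t2 INNER JOIN t1 ON t2.id = t1.id
    WHERE t1.col1 = 'someValue'
""").fetchall()

# Moving the predicate into the ON clause keeps the unmatched t2 rows.
filter_in_on = conn.execute("""
    SELECT t1.id, t1.col1, t2.id
    FROM t2 LEFT JOIN t1 ON t2.id = t1.id AND t1.col1 = 'someValue'
""").fetchall()

print(outer_with_where)
print(inner_join)
print(filter_in_on)
```

If the unmatched rows are wanted, the predicate belongs in the ON clause (or the WHERE clause has to allow NULL explicitly).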


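None of the above answers is actually correct, because they do not follow the semantics when there are duplicated values.

For a query such as (from this duplicate):

SELECT * FROM t1 FULL OUTER JOIN t2 ON t1.Name = t2.Name;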

The correct equivalent is:

SELECT t1.*, t2.*
FROM (SELECT name FROM t1 UNION  -- This is intentionally UNION to remove duplicates
      SELECT name FROM t2
     ) n LEFT JOIN
     t1
     ON t1.name = n.name LEFT JOIN
     t2
     ON t2.name = n.name;

If you need this to work with NULL values (which may also be necessary), then use the NULL-safe comparison operator, <=>, rather than =.
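Here is a small sketch of what goes wrong with plain = when the join key can be NULL. It uses Python's standard sqlite3 module as a stand-in for MySQL, and SQLite's IS operator in place of MySQL's <=>; both substitutions are my assumptions for the sake of a runnable example. The sample rows are invented.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, name TEXT);
    CREATE TABLE t2 (id INTEGER, name TEXT);
    INSERT INTO t1 VALUES (1, 'Tim'), (2, NULL);
    INSERT INTO t2 VALUES (3, 'Tim'), (4, NULL);
""")

# The key-list pattern from the answer, with the comparison operator left open.
QUERY = """
    SELECT t1.id, t1.name, t2.id, t2.name
    FROM (SELECT name FROM t1 UNION  -- intentionally UNION, to deduplicate keys
          SELECT name FROM t2) AS n
    LEFT JOIN t1 ON t1.name {op} n.name
    LEFT JOIN t2 ON t2.name {op} n.name
"""

# With '=', the NULL key matches nothing, so ids 2 and 4 are lost.
with_equals = conn.execute(QUERY.format(op="=")).fetchall()
# With SQLite's IS (the counterpart of MySQL's <=>), NULL keys match each other.
with_null_safe = conn.execute(QUERY.format(op="IS")).fetchall()

print(with_equals)
print(with_null_safe)
```

Note that the NULL-safe version pairs the two NULL-named rows with each other, so whether that is the desired outcome depends on how NULL keys are supposed to be treated.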


In SQLite you should do this:

SELECT *
FROM leftTable lt
LEFT JOIN rightTable rt ON lt.id = rt.lrid
UNION
SELECT lt.*, rt.*  -- To match column set
FROM rightTable rt
LEFT JOIN leftTable lt ON lt.id = rt.lrid


Modified sha.t's query for more clarity:

-- t1 left join t2
SELECT t1.value, t2.value
FROM t1 LEFT JOIN t2 ON t1.value = t2.value

    UNION ALL -- include duplicates

-- t1 right exclude join t2 (records found only in t2)
SELECT t1.value, t2.value
FROM t1 RIGHT JOIN t2 ON t1.value = t2.value
WHERE t1.value IS NULL

You can do the following:

(SELECT
    *
FROM
    table1 t1
        LEFT JOIN
    table2 t2 ON t1.id = t2.id)  -- no WHERE here: filtering on t2.id IS NULL would drop the matched rows
UNION ALL
(SELECT
    *
FROM
    table1 t1
        RIGHT JOIN
    table2 t2 ON t1.id = t2.id
WHERE
    t1.id IS NULL);  -- only the t2 rows without a match in t1


What do you think about a cross join solution?

SELECT t1.*, t2.*
FROM table1 t1
INNER JOIN table2 t2
ON 1=1;
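To see what this query returns, here is a minimal sketch that runs it next to a UNION ALL emulation. It uses Python's standard sqlite3 module as a stand-in for MySQL, and it borrows the Tim/Marta/Katarina sample rows from an earlier answer; both are assumptions made for the example. A join with ON 1=1 pairs every row of table1 with every row of table2, so it produces the Cartesian product and not the full outer join.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1, 'Tim'), (2, 'Marta');
    INSERT INTO table2 VALUES (1, 'Tim'), (3, 'Katarina');
""")

# The proposed query: every table1 row paired with every table2 row.
cross_rows = conn.execute("""
    SELECT t1.*, t2.*
    FROM table1 t1
    INNER JOIN table2 t2
    ON 1=1
""").fetchall()

# A full outer join emulation on id, for comparison.
full_rows = conn.execute("""
    SELECT t1.*, t2.* FROM table1 t1 LEFT JOIN table2 t2 ON t1.id = t2.id
    UNION ALL
    SELECT t1.*, t2.* FROM table2 t2 LEFT JOIN table1 t1 ON t1.id = t2.id
    WHERE t1.id IS NULL
""").fetchall()

print(len(cross_rows), cross_rows)
print(len(full_rows), full_rows)
```

The cross join pairs Marta with Katarina even though their ids differ, and it never produces a NULL-extended row.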


SELECT
    a.name,
    b.title
FROM
    author AS a
LEFT JOIN
    book AS b
    ON a.id = b.author_id
UNION
SELECT
    a.name,
    b.title
FROM
    author AS a
RIGHT JOIN
    book AS b
    ON a.id = b.author_id

I fixed the response, and it now includes all rows (based on Pavle Lekic's response):

(
SELECT a.* FROM tablea a
LEFT JOIN tableb b ON a.`key` = b.`key`
WHERE b.`key` IS NULL
)
UNION ALL
(
SELECT a.* FROM tablea a
LEFT JOIN tableb b ON a.`key` = b.`key`
WHERE a.`key` = b.`key`
)
UNION ALL
(
SELECT b.* FROM tablea a
RIGHT JOIN tableb b ON b.`key` = a.`key`
WHERE a.`key` IS NULL
);


This is also possible, but you have to list the same field names in both SELECTs.

SELECT t1.name, t2.name FROM t1
LEFT JOIN t2 ON t1.id = t2.id
UNION
SELECT t1.name, t2.name FROM t2
LEFT JOIN t1 ON t1.id = t2.id


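The SQL standard says that full join on is inner join on rows union all unmatched left table rows extended by nulls union all unmatched right table rows extended by nulls. That is, inner join on rows union all rows in left join on but not in inner join on union all rows in right join on but not in inner join on.

That is, left join on rows union all right join on rows not in inner join on. Or, if you know that your inner join on result cannot have null in a particular left table column, then "right join on rows not in inner join on" are the rows of right join on in which that column is null.

That is, similarly, right join on union all the appropriate left join on rows.

From "What is the difference between "INNER JOIN" and "OUTER JOIN"?":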

(SQL Standard 2006 SQL/Foundation 7.7 Syntax Rules 1, General Rules 1 b, 3 c & d, 5 b.)


Answer:

SELECT * FROM t1 FULL OUTER JOIN t2 ON t1.id = t2.id;

Can be recreated as follows:

SELECT t1.*, t2.*
FROM (SELECT id FROM t1 UNION SELECT id FROM t2) tmp
LEFT JOIN t1 ON t1.id = tmp.id
LEFT JOIN t2 ON t2.id = tmp.id;

Using a UNION or UNION ALL answer does not cover the edge case where the base tables have duplicated entries.

Explanation:

There is an edge case that neither a UNION nor a UNION ALL can cover. We cannot test this on MySQL, as it doesn't support FULL OUTER JOIN, but we can illustrate it on a database that does support it:

WITH cte_t1 AS
(
    SELECT 1 AS id1
    UNION ALL SELECT 2
    UNION ALL SELECT 5
    UNION ALL SELECT 6
    UNION ALL SELECT 6
),
cte_t2 AS
(
    SELECT 3 AS id2
    UNION ALL SELECT 4
    UNION ALL SELECT 5
    UNION ALL SELECT 6
    UNION ALL SELECT 6
)
SELECT * FROM cte_t1 t1 FULL OUTER JOIN cte_t2 t2 ON t1.id1 = t2.id2;

This gives us this answer:

id1   id2
1     NULL
2     NULL
NULL  3
NULL  4
5     5
6     6
6     6
6     6
6     6

The UNION solution:

SELECT * FROM cte_t1 t1 LEFT OUTER JOIN cte_t2 t2 ON t1.id1 = t2.id2
UNION
SELECT * FROM cte_t1 t1 RIGHT OUTER JOIN cte_t2 t2 ON t1.id1 = t2.id2

Gives an incorrect answer:

id1   id2
NULL  3
NULL  4
1     NULL
2     NULL
5     5
6     6

The UNION ALL solution:

SELECT * FROM cte_t1 t1 LEFT OUTER JOIN cte_t2 t2 ON t1.id1 = t2.id2
UNION ALL
SELECT * FROM cte_t1 t1 RIGHT OUTER JOIN cte_t2 t2 ON t1.id1 = t2.id2

Is also incorrect:

id1   id2
1     NULL
2     NULL
5     5
6     6
6     6
6     6
6     6
NULL  3
NULL  4
5     5
6     6
6     6
6     6
6     6

Whereas this query:

SELECT t1.*, t2.*
FROM (SELECT id1 AS id FROM cte_t1 UNION SELECT id2 FROM cte_t2) tmp
LEFT JOIN cte_t1 t1 ON t1.id1 = tmp.id
LEFT JOIN cte_t2 t2 ON t2.id2 = tmp.id;

Gives the following:

id1   id2
1     NULL
2     NULL
NULL  3
NULL  4
5     5
6     6
6     6
6     6
6     6

The order may be different, but it otherwise matches the correct answer.
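The three row counts above can be reproduced with a short script. This is a sketch in Python using the standard sqlite3 module as a stand-in for MySQL (my assumption). Real tables t1 and t2 replace the CTEs, and the right joins are written as swapped LEFT JOINs so the script also runs on SQLite builds without RIGHT JOIN; the constant and variable names are made up for the example.

```python
import sqlite3
from collections import Counter

# In-memory SQLite database standing in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id1 INTEGER);
    CREATE TABLE t2 (id2 INTEGER);
    INSERT INTO t1 VALUES (1), (2), (5), (6), (6);
    INSERT INTO t2 VALUES (3), (4), (5), (6), (6);
""")

LEFT = "SELECT id1, id2 FROM t1 LEFT JOIN t2 ON id1 = id2"
# Right join expressed as a left join with the tables swapped.
RIGHT = "SELECT id1, id2 FROM t2 LEFT JOIN t1 ON id1 = id2"

# Plain UNION collapses the four (6, 6) rows into one.
union_rows = conn.execute(f"{LEFT} UNION {RIGHT}").fetchall()
# Plain UNION ALL of both outer joins counts every matched row twice.
union_all_rows = conn.execute(f"{LEFT} UNION ALL {RIGHT}").fetchall()
# Key-list approach: distinct keys first, then left join both tables to them.
key_rows = conn.execute("""
    SELECT t1.id1, t2.id2
    FROM (SELECT id1 AS id FROM t1 UNION SELECT id2 FROM t2) AS tmp
    LEFT JOIN t1 ON t1.id1 = tmp.id
    LEFT JOIN t2 ON t2.id2 = tmp.id
""").fetchall()

print(len(union_rows), len(union_all_rows), len(key_rows))
print(Counter(key_rows))
```

Only the key-list query reproduces the nine rows of the real FULL OUTER JOIN for this data.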