What happens if SQL SUM() reaches the type's capacity (overflow)?

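I understand that the answer is vendor-dependent, but my question is: should I worry when an aggregate function such as SUM operates on a small type?

For example, MariaDB uses 4 bytes for the INT type. A developer might assume that the amount in any single transaction never exceeds a few thousand.

But what happens if we try to get the income for a whole year across all departments? For example: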

-- CREATE TABLE income (dt DATETIME, department INT, amount INT);
SELECT SUM(amount) FROM income WHERE dt BETWEEN '2014-01-01' AND '2014-12-31'
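
A rough back-of-the-envelope check, sketched in Python rather than SQL, shows how few rows are needed before the total no longer fits in a signed 4-byte INT. The per-transaction amount of 5000 is an assumed figure for illustration only, and `rows_until_overflow` is a hypothetical helper.

```python
# Largest value a signed 32-bit INT column can hold.
INT_MAX = 2**31 - 1  # 2147483647

def rows_until_overflow(amount_per_row: int, limit: int = INT_MAX) -> int:
    """Number of rows of `amount_per_row` whose total first exceeds `limit`."""
    return limit // amount_per_row + 1

# Assumed figure: every transaction is worth 5000 units.
rows = rows_until_overflow(5000)
print(rows)  # 429497
```

At that assumed size, fewer than half a million transactions in a year would already push the total past what INT can hold.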

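Increasing the storage size just to avoid overflow in the SUM aggregate seems silly.

What should I worry about? Do the SQL-92/99/2008 standards give any guarantees or clarification?

Do JDBC drivers offer any special support for this?

Should I rewrite the SELECT like this: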

SELECT SUM(CAST(amount AS BIGINT)) FROM income
  WHERE dt BETWEEN '2014-01-01' AND '2014-12-31'


It's fairly easy to test on MySQL:

32-bit overflow:

mysql> SELECT SUM(x) FROM (
    SELECT pow(2,31) AS x
    UNION ALL
    SELECT pow(2,31)
    UNION ALL
    SELECT pow(2,31)
) AS bignums;
+------------+
| SUM(x)     |
+------------+
| 6442450944 | // returned as a "bigint"
+------------+
1 row in set (0.00 sec)

64-bit:

mysql> SELECT SUM(x) FROM (
    SELECT pow(2,63) AS x
    UNION ALL
    SELECT pow(2,63)
    UNION ALL
    SELECT pow(2,63)
) AS bignums;
+-----------------------+
| SUM(x)                |
+-----------------------+
| 2.7670116110564327e19 | // returned as float
+-----------------------+
1 row in set (0.00 sec)

Double:

mysql> SELECT SUM(x) FROM (
    SELECT 1.7e+308 AS x
    UNION ALL
    SELECT 1.7e+308
    UNION ALL
    SELECT 1.7e+308
) AS bignums;
+--------+
| SUM(x) |
+--------+
|      0 |
+--------+
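
The last two results can be compared against plain IEEE 754 double arithmetic. The sketch below uses Python floats (which are doubles) as a stand-in; it does not call MySQL, and reading the 0 above as MySQL's way of reporting a non-finite sum is an assumption, not something the output itself confirms.

```python
import math

# 2**63 is exactly representable as a double, and so is 3 * 2**63.
total_64 = sum([float(2**63)] * 3)
print(total_64 == 3 * 2**63)  # True: no precision is lost here

# Adding values near the top of the double range leaves the finite range.
total_double = sum([1.7e308] * 3)
print(math.isinf(total_double))  # True
```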

Follow-up from the comments:

mysql> DESCRIBE overflow
    -> ;
+-------+------------+------+-----+---------+-------+
| Field | Type       | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
| x     | INT(11)    | YES  |     | NULL    |       |
| y     | BIGINT(20) | YES  |     | NULL    |       |
| z     | DOUBLE     | YES  |     | NULL    |       |
+-------+------------+------+-----+---------+-------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM overflow;
+------------+---------------------+---------+
| x          | y                   | z       |
+------------+---------------------+---------+
| 2147483647 | 9223372036854775807 | 1.7e308 |
| 2147483647 | 9223372036854775807 | 1.7e308 |
| 2147483647 | 9223372036854775807 | 1.7e308 |
+------------+---------------------+---------+
3 rows in set (0.00 sec)

mysql> SELECT SUM(x), SUM(y), SUM(z) FROM overflow;
+------------+----------------------+--------+
| SUM(x)     | SUM(y)               | SUM(z) |
+------------+----------------------+--------+
| 6442450941 | 27670116110564327421 |      0 |
+------------+----------------------+--------+
1 row in set (0.00 sec)
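
The integer sums in this output can be checked by hand. The short Python sketch below (Python integers are arbitrary precision, so nothing overflows in the check itself) suggests that both totals are exact and that both exceed the range of the column type they were summed from.

```python
INT_MAX = 2**31 - 1     # largest signed 32-bit value, as stored in column x
BIGINT_MAX = 2**63 - 1  # largest signed 64-bit value, as stored in column y

sum_x = 3 * INT_MAX
sum_y = 3 * BIGINT_MAX

print(sum_x)  # 6442450941
print(sum_y)  # 27670116110564327421

# Neither total fits in the type of the column it came from.
print(sum_x > INT_MAX, sum_y > BIGINT_MAX)  # True True
```

So the integer results were widened rather than wrapped, which is consistent with MySQL's documented behavior of returning DECIMAL from SUM() for exact-value arguments.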


Postgres handles this without overflowing or truncating.

From the manual:

sum(expression), Return Type: bigint for smallint or int arguments, numeric for bigint arguments, otherwise the same as the argument data type

http://www.postgresql.org/docs/current/static/functions-aggregate.html

A quick test confirms this:

psql (9.4.5)
Type "help" for help.

postgres=> CREATE TABLE x (amount INT);
CREATE TABLE
postgres=>
postgres=> INSERT INTO x VALUES (2147483647), (2147483647);
INSERT 0 2
postgres=> SELECT SUM(amount)
wbtest-> FROM x;
    sum
------------
 4294967294
(1 row)

postgres=>
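
The rule quoted from the manual explains why the test above did not overflow. It can be written down as a small lookup; this is only an illustrative Python sketch of the quoted sentence, and the function name `postgres_sum_result_type` is hypothetical, not part of PostgreSQL or any driver.

```python
def postgres_sum_result_type(arg_type: str) -> str:
    """Result type of sum() per the manual sentence quoted above."""
    arg_type = arg_type.lower()
    if arg_type in ("smallint", "int", "integer"):
        return "bigint"   # widened, so 32-bit inputs do not wrap at 2**31
    if arg_type == "bigint":
        return "numeric"  # arbitrary-precision type
    return arg_type       # otherwise the same as the argument data type

print(postgres_sum_result_type("int"))     # bigint
print(postgres_sum_result_type("bigint"))  # numeric
print(postgres_sum_result_type("real"))    # real
```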

Interestingly, the SQL standard requires the statement to fail in this situation:

If, during the computation of the result of AF, an intermediate result is not representable in the declared type of the site that contains that intermediate result, then

...
Otherwise, an exception condition is raised: data exception — numeric value out of range.

(AF = aggregate function)
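
The behavior the standard describes, failing instead of wrapping or silently widening, can be sketched as a checked accumulation. The Python below is a model only: `checked_sum` and its `bits` parameter are hypothetical names, and a real engine would raise its own error rather than Python's `OverflowError`.

```python
def checked_sum(values, bits=32):
    """Sum integers, failing if any intermediate result leaves the signed range."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    total = 0
    for v in values:
        total += v
        if not lo <= total <= hi:
            # Mirrors the standard's "numeric value out of range" data exception.
            raise OverflowError("numeric value out of range")
    return total

print(checked_sum([1000, 2000, 3000]))  # 6000

try:
    checked_sum([2147483647, 2147483647])
except OverflowError as exc:
    print(exc)  # numeric value out of range
```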


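If I understand you correctly, you are asking what happens in case of an overflow.

At least for SQL Server, consult the following documentation:

https://msdn.microsoft.com/de-de/library/ms187810%28v=sql.120%29.aspx

It states what the return type of SUM() is for each input type: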

Expression result               Return type
------------------------------------------------
tinyint                         int
smallint                        int
int                             int
bigint                          bigint
decimal category (p, s)         decimal(38, s)
money and smallmoney category   money
float and real category         float

This means that an overflow can indeed occur. I would therefore suggest using a float or money type for salaries rather than int.
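
To make the table concrete, the sketch below encodes its integer rows and checks whether a given set of values would exceed the range of the listed return type. It is a Python model of the table only, not a call into SQL Server; `SUM_RETURN_TYPE`, `RANGES`, and `sum_fits` are hypothetical names.

```python
# Integer rows of the table above: argument type -> return type of SUM().
SUM_RETURN_TYPE = {"tinyint": "int", "smallint": "int", "int": "int", "bigint": "bigint"}

# Signed ranges of the two integer return types.
RANGES = {
    "int": (-(2**31), 2**31 - 1),
    "bigint": (-(2**63), 2**63 - 1),
}

def sum_fits(arg_type, values):
    """True if the exact total fits in the return type listed for arg_type."""
    lo, hi = RANGES[SUM_RETURN_TYPE[arg_type]]
    return lo <= sum(values) <= hi

print(sum_fits("int", [2147483647, 2147483647]))     # False: int sums stay int
print(sum_fits("bigint", [2147483647, 2147483647]))  # True: same values as bigint
```

This is the same idea as the CAST(amount AS BIGINT) rewrite from the question: widening the argument widens the return type.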