Generating All Dates Between a Start Date and an End Date with HiveSQL

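Goal: use HiveSQL to list every date in a date range. Given a start date and an end date, return all dates between them, with both endpoints included.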

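(posexplode returns one more column than explode: the position of each element. That position is what the next step uses, here as the day offset added to start_date.)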
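To make the difference concrete, here is a minimal sketch that runs explode and posexplode over the same array. The literal array and the aliases `d`, `dummy`, `e` and `p` are illustrative assumptions and are not part of the original queries.

```sql
-- explode: one row per array element, value only
select e.val
from (select 1 as dummy) d
lateral view explode(array('a', 'b', 'c')) e as val
;
-- val: a, b, c

-- posexplode: one row per array element, with its zero-based position
select p.pos, p.val
from (select 1 as dummy) d
lateral view posexplode(array('a', 'b', 'c')) p as pos, val
;
-- (pos, val): (0, a), (1, b), (2, c)
```

Because pos starts at 0, it can be passed directly to date_add as the number of days after start_date.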

Given a table user as follows:

uid start_date  end_date
1   2020-01-01  2020-01-05

Query 1:

set hive.exec.mode.local.auto=true; -- enable automatic local mode
set hive.cli.print.header=true; -- print column headers (table.column)

select
    tmp.*,
    t.*,
    date_add(start_date, pos) as mid_date
from(
    select '1' as uid,
    '2020-01-01' as start_date,
    '2020-01-05' as end_date
)tmp
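-- space(n) builds a string of n spaces, and splitting it on the empty string
-- gives one more element than n here, so pos runs from 0 to datediff(end_date, start_date)
lateral view posexplode( split( space( datediff( end_date, start_date ) ), '' ) ) t as pos, val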
;

Result of query 1:

tmp.uid tmp.start_date  tmp.end_date    t.pos   t.val   mid_date
1   2020-01-01  2020-01-05  0       2020-01-01
1   2020-01-01  2020-01-05  1       2020-01-02
1   2020-01-01  2020-01-05  2       2020-01-03
1   2020-01-01  2020-01-05  3       2020-01-04
1   2020-01-01  2020-01-05  4       2020-01-05
Time taken: 0.086 seconds, Fetched: 5 row(s)
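Query 1 returns 5 rows for a difference of 4 days, which is why both the start date and the end date appear in the output. The following sketch, using the same literal dates, is one way to check the element count directly. The aliases `day_diff` and `element_count` are assumptions for illustration, and the count depends on how split treats an empty pattern in the Hive version in use.

```sql
-- Compare the day difference with the number of elements that posexplode will see.
select
    datediff('2020-01-05', '2020-01-01') as day_diff,
    size( split( space( datediff('2020-01-05', '2020-01-01') ), '' ) ) as element_count
;
```

If element_count comes out one larger than day_diff, as the 5 rows of query 1 suggest, the date range is covered inclusively.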

Query 2:

select
    tmp.*,
    t.*,
    date_add(start_date, pos) as mid_date
from(
    select '1' as uid,
        '2020-01-01' as start_date,
        '2020-01-05' as end_date

    union all
    select '1' as uid,
        '2020-01-01' as start_date,
        '2020-01-25' as end_date
)tmp
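-- note: datediff is given literal dates here, so every row of tmp is expanded
-- into the same 5 positions, regardless of that row's own end_date
lateral view posexplode( split( repeat( 'a', datediff( '2020-01-05', '2020-01-01' )), '' ) ) t as pos, val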
;

Result of query 2:

tmp.uid tmp.start_date  tmp.end_date    t.pos   t.val   mid_date
1   2020-01-01  2020-01-05  0   a   2020-01-01
1   2020-01-01  2020-01-05  1   a   2020-01-02
1   2020-01-01  2020-01-05  2   a   2020-01-03
1   2020-01-01  2020-01-05  3   a   2020-01-04
1   2020-01-01  2020-01-05  4       2020-01-05
1   2020-01-01  2020-01-25  0   a   2020-01-01
1   2020-01-01  2020-01-25  1   a   2020-01-02
1   2020-01-01  2020-01-25  2   a   2020-01-03
1   2020-01-01  2020-01-25  3   a   2020-01-04
1   2020-01-01  2020-01-25  4       2020-01-05
Time taken: 1.335 seconds, Fetched: 10 row(s)
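In the result of query 2, the row ending on 2020-01-25 stops at 2020-01-05, because the number of positions comes from literal dates rather than from the row itself. A sketch of a per-row variant is shown below, assuming the same two input rows. It reuses the space-based expression from query 1, and the explicit column list is an illustrative choice.

```sql
-- Same input as query 2, but the number of generated positions
-- follows each row's own start_date and end_date.
select
    tmp.uid,
    tmp.start_date,
    tmp.end_date,
    date_add(tmp.start_date, t.pos) as mid_date
from(
    select '1' as uid,
        '2020-01-01' as start_date,
        '2020-01-05' as end_date
    union all
    select '1' as uid,
        '2020-01-01' as start_date,
        '2020-01-25' as end_date
)tmp
lateral view posexplode( split( space( datediff( tmp.end_date, tmp.start_date ) ), '' ) ) t as pos, val
;
```

Under the same split behavior seen in query 1, this should yield 5 dates for the first row and 25 for the second, ending on 2020-01-25.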