What's the proper index for querying structures in arrays in Postgres jsonb?
I'm experimenting with keeping values like the following in a jsonb column in Postgres 9.4:
    [{"event_slug":"test_1","start_time":"2014-10-08","end_time":"2014-10-12"},
     {"event_slug":"test_2","start_time":"2013-06-24","end_time":"2013-07-02"},
     {"event_slug":"test_3","start_time":"2014-03-26","end_time":"2014-03-30"}]
I'm executing queries like this:
    SELECT *
    FROM   locations
    WHERE  EXISTS (
       SELECT 1
       FROM   jsonb_array_elements(events) AS e
       WHERE (
          e->>'event_slug' = 'test_1'
          AND (e->>'start_time' >= '2014-10-30 14:04:06 -0400'
            OR e->>'end_time'   >= '2014-10-30 14:04:06 -0400')
       )
    )
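How would I create an index on this data that queries like the one above can use? Does this sound like a reasonable design for a few million rows that each contain about 10 events in that column?
It's worth noting that I still seem to get sequential scans with: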
    CREATE INDEX events_gin_idx ON some_table USING GIN (events);
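I assume that's because the first thing I do in the query is convert the data to JSON array elements.
First of all, you cannot access JSON array values like that. For a given JSON value: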
    [{"event_slug":"test_1","start_time":"2014-10-08","end_time":"2014-10-12"},
     {"event_slug":"test_2","start_time":"2013-06-24","end_time":"2013-07-02"},
     {"event_slug":"test_3","start_time":"2014-03-26","end_time":"2014-03-30"}]
A valid test against the first array element would be:
    WHERE e->0->>'event_slug' = 'test_1'
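But you probably don't want to limit the search to the first element of the array. With the jsonb data type in Postgres 9.4 you get additional operators and index support, and indexing the elements of an array takes a GIN index.
The built-in operator classes for GIN indexes do not support the "greater than" or "less than" operators (> >= < <=). That holds for jsonb as well, where you can choose between two operator classes: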
    Name             Indexed Data Type   Indexable Operators
    ...
    jsonb_ops        jsonb               ? ?& ?| @>
    jsonb_path_ops   jsonb               @>
(jsonb_ops is the default.)
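As a small illustration of why containment (@>) is the operator of interest for this data, and not the existence operator (?), here is a sketch using the literal from the question. The ? operator only looks at top-level keys or top-level string elements, so it does not see keys nested inside the array's objects.

```sql
-- ? checks top-level keys / top-level array string elements only.
-- The top level here is an array of objects, so this returns false.
SELECT '[{"event_slug":"test_1"}]'::jsonb ? 'event_slug';

-- @> checks containment, including objects nested in the array.
-- This returns true.
SELECT '[{"event_slug":"test_1"}]'::jsonb @> '[{"event_slug":"test_1"}]'::jsonb;
```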
To support the equality check with an index:
    CREATE INDEX locations_events_gin_idx ON locations
    USING  gin (events jsonb_path_ops);

    SELECT * FROM locations WHERE events @> '[{"event_slug":"test_1"}]';
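This might be good enough if the filter is selective enough. Assuming that end_time >= start_time, we don't need two checks; checking only end_time is cheaper and equivalent: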
    SELECT l.*
    FROM   locations l
         , jsonb_array_elements(l.events) e
    WHERE  l.events @> '[{"event_slug":"test_1"}]'
    AND   (e->>'end_time')::timestamp >= '2014-10-30 14:04:06 -0400'::timestamptz;
This uses an implicit JOIN LATERAL. Details:
- PostgreSQL unnest() with element number
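Note the different data types! What you have in the JSON value looks like timestamp [without time zone], while your predicates use timestamp with time zone literals. The literal has to be cast to timestamptz explicitly, as in the query above; cast to plain timestamp, its UTC offset would be ignored. Detailed explanation:
- Ignoring time zones altogether in Rails and PostgreSQL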
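The data type difference can be seen directly with the literal from the question. This is a minimal sketch; the SET is only there to make the second result reproducible and changes the time zone for the current session.

```sql
-- Cast to timestamp [without time zone]: the "-0400" offset is silently ignored.
SELECT '2014-10-30 14:04:06 -0400'::timestamp;    -- 2014-10-30 14:04:06

-- Cast to timestamptz: the offset is honored. The displayed value
-- depends on the session's TimeZone setting, so pin it to UTC first.
SET timezone = 'UTC';
SELECT '2014-10-30 14:04:06 -0400'::timestamptz;  -- 2014-10-30 18:04:06+00
```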
More about jsonb_array_elements():
- PostgreSQL joining using JSONB
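To check whether the containment filter actually uses the GIN index, one option is to look at the query plan. The sketch below assumes a scratch database and a hypothetical minimal locations table (location_id and events only; the real table will have more columns). On a table this small the planner may still prefer a sequential scan, so the plan is only meaningful with realistic data volumes.

```sql
-- Hypothetical minimal table, for experimenting only.
CREATE TABLE locations (
    location_id serial PRIMARY KEY,
    events      jsonb
);

INSERT INTO locations (events) VALUES
  ('[{"event_slug":"test_1","start_time":"2014-10-08","end_time":"2014-10-12"},
     {"event_slug":"test_2","start_time":"2013-06-24","end_time":"2013-07-02"}]');

CREATE INDEX locations_events_gin_idx ON locations
USING  gin (events jsonb_path_ops);

-- For testing on tiny tables only: discourage sequential scans
-- so the planner shows whether the index is usable at all.
SET enable_seqscan = off;

EXPLAIN (ANALYZE)
SELECT * FROM locations WHERE events @> '[{"event_slug":"test_1"}]';
```

A "Bitmap Index Scan" on locations_events_gin_idx in the output indicates that the index is being used.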
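Advanced solution
If the above is not good enough, I would consider a MATERIALIZED VIEW that stores the relevant attributes in normalized form. This allows plain btree indexes.
The code assumes that your JSON values have a consistent format, as displayed in the question.
Setup: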
    CREATE TYPE event_type AS (
       event_slug text
     , start_time timestamp
     , end_time   timestamp
    );

    CREATE MATERIALIZED VIEW loc_event AS
    SELECT l.location_id, e.event_slug, e.end_time  -- start_time not needed
    FROM   locations l, jsonb_populate_recordset(NULL::event_type, l.events) e;
More about jsonb_populate_recordset():
- How to convert PostgreSQL 9.4's jsonb type to float
    CREATE INDEX loc_event_idx ON loc_event (event_slug, end_time, location_id);
The index also includes location_id to allow index-only scans.
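A materialized view is a snapshot, so it has to be refreshed after the underlying locations table changes. A minimal sketch follows; the unique index name is hypothetical, and the unique index itself assumes that no location holds the same event_slug with the same end_time twice.

```sql
-- Plain refresh: rebuilds the view and locks it against reads while running.
REFRESH MATERIALIZED VIEW loc_event;

-- CONCURRENTLY (available since Postgres 9.4) does not block reads,
-- but requires a unique index on the view that covers all rows.
-- Assumption: (location_id, event_slug, end_time) is unique in the data.
CREATE UNIQUE INDEX loc_event_uni_idx
    ON loc_event (location_id, event_slug, end_time);

REFRESH MATERIALIZED VIEW CONCURRENTLY loc_event;
```

How often to refresh depends on how stale the event data may be for the queries at hand.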
Query:
    SELECT *
    FROM   loc_event
    WHERE  event_slug = 'test_1'
    AND    end_time >= '2014-10-30 14:04:06 -0400'::timestamptz;
Or, if you need full rows from the underlying locations table:
    SELECT l.*
    FROM  (
       SELECT DISTINCT location_id
       FROM   loc_event
       WHERE  event_slug = 'test_1'
       AND    end_time >= '2014-10-30 14:04:06 -0400'::timestamptz
       ) le
    JOIN   locations l USING (location_id);
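The same rows can also be fetched with an EXISTS semi-join in place of the DISTINCT subquery. This is only an alternative sketch, assuming location_id is unique in locations; which form is faster depends on the data and is best checked with EXPLAIN ANALYZE.

```sql
SELECT l.*
FROM   locations l
WHERE  EXISTS (
   SELECT 1
   FROM   loc_event le
   WHERE  le.location_id = l.location_id  -- correlate view rows to the base table
   AND    le.event_slug = 'test_1'
   AND    le.end_time >= '2014-10-30 14:04:06 -0400'::timestamptz
   );
```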
    CREATE INDEX json_array_elements_index ON
        json_array_elements ((events_arr->>'event_slug'));
That should get you started in the right direction.