Filter jsonb results in subselect
I'm building a hierarchical JSON result from several tables. These are only examples, but they should be sufficient for the purpose of this demonstration:

```sql
CREATE TABLE book (
    id   INTEGER PRIMARY KEY NOT NULL,
    data JSONB
);

CREATE TABLE author (
    id   INTEGER PRIMARY KEY NOT NULL,
    data JSONB
);

CREATE TABLE book_author (
    id        INTEGER PRIMARY KEY NOT NULL,
    author_id INTEGER,
    book_id   INTEGER
);

CREATE UNIQUE INDEX pk_unique ON book_author (author_id, book_id);
```
Test data:

```sql
INSERT INTO book (id, data) VALUES
  (1, '{"pages": 432,"title":"2001: A Space Odyssey"}')
, (2, '{"pages": 300,"title":"The City And The City"}')
, (3, '{"pages": 143,"title":"Unknown Book"}');

INSERT INTO author (id, data) VALUES
  (1, '{"age": 90,"name":"Arthur C. Clarke"}')
, (2, '{"age": 43,"name":"China Miéville"}');

INSERT INTO book_author (id, author_id, book_id) VALUES
  (1, 1, 1)
, (2, 1, 2);
```
I created the following function:

```sql
CREATE OR REPLACE FUNCTION public.book_get()
  RETURNS json AS
$BODY$
DECLARE
    result json;
BEGIN
      SELECT to_json(array_agg(_b)) INTO result
      FROM (
        SELECT
          book.id id,
          book.data->>'title' title,
          book.data->>'pages' pages,
          (
            SELECT to_json(array_agg(_a))
            FROM (
              SELECT
                author.id id,
                author.data->>'name' "name",
                author.data->>'age' age
              FROM
                author, book_author ba
              WHERE
                ba.author_id = author.id AND
                  ba.book_id = book.id
              ORDER BY id
            ) _a
          ) authors
        FROM
          book
        ORDER BY id ASC
      ) _b;

    RETURN result;
END;
$BODY$ LANGUAGE plpgsql VOLATILE;
```
Executing the function

```sql
SELECT book_get();
```

produces the following result:

```json
[
   {
      "id":1,
      "title":"2001: A Space Odyssey",
      "pages":432,
      "authors":[
         {
            "id":1,
            "name":"Arthur C. Clarke",
            "age":90
         }
      ]
   },
   {
      "id":2,
      "title":"The City And The City",
      "pages":300,
      "authors":[
         {
            "id":2,
            "name":"China Miéville",
            "age":43
         }
      ]
   },
   {
      "id":3,
      "title":"Unknown Book",
      "pages":143,
      "authors":null
   }
]
```
Now I can filter the data with a WHERE clause, for example:

```sql
SELECT to_json(array_agg(_b)) INTO result
FROM (
  ...
) _b
-- give me the book with id 1
WHERE _b.id = 1;
-- or give me all titles with the occurrence of 'City' anywhere
WHERE _b.title LIKE '%City%';
-- or has more than 200 pages
WHERE _b.pages > 200;
```
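How can I filter on authors?

I have absolutely no idea what type `_b.authors` is.

Accessing it with JSON operators gives me an error:

```
operator does not exist: record -> json
Hint: No operator matches the given name and argument type(s). You might need to add explicit type casts.
```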
I remember using GROUP BY with a HAVING clause:

```sql
GROUP BY _b.authors
HAVING _b.authors->>'name' = 'Arthur C. Clarke';
```

But it gives me the error:
ERROR: could not identify an equality operator for type json
To make it clearer:

```sql
SELECT to_json(array_agg(_b)) INTO result
FROM (
  ...
) _b
WHERE _b.authors->0->>'name' = 'Arthur C. Clarke';
```

would basically do what I need, but it only matches when the author is at index 0 of the array.
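Update 2
OK, I read the docs again... It seems I missed the part of the Postgres docs saying that JSON and JSONB differ with regard to functions; I thought the difference only concerned the data type.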
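Update 3
@ErwinBrandstetter: Makes sense. I didn't know about LATERAL yet; good to know it exists. I've got the hang of the JSON / JSONB functions and operators, and they make a lot of sense to me now. What is still not clear to me is pattern matching with LIKE in the WHERE clause.
If I need to use `jsonb_array_elements()` to unnest the objects in the array, I can run

```sql
SELECT *
FROM jsonb_array_elements('[
  {"age": 90,"name":"the Arthur C. Clarke"},
  {"age": 43,"name":"China Miéville"},
  {"age": null,"name":"Erwin the Brandstetter"}
]'::jsonb) author
WHERE author->>'name' LIKE '%the%';
```

and get the desired result,

```
1: {"age": 90,"name":"the Arthur C. Clarke"}
2: {"age": null,"name":"Erwin the Brandstetter"}
```

but how do I achieve the same in the final (last) WHERE clause of my example?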
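Update 4
Of course, in the FROM clause:

```sql
SELECT json_agg(_b) INTO result
FROM (
  ...
) _b,
jsonb_array_elements(_b.authors) AS arrauthors
WHERE arrauthors->>'name' LIKE 'Arthur %';
```

This gives all books that have an author whose name starts with 'Arthur'. I'd still appreciate comments or updates on this approach.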
How would I make it possible to filter on authors? E.g. something
equivalent to `WHERE _b.authors.'name' = 'Arthur C. Clarke'`.
Basic function
Your basic function can be simpler:

```sql
CREATE OR REPLACE FUNCTION public.book_get()
  RETURNS jsonb AS
$func$
SELECT jsonb_agg(books)
FROM  (
   SELECT b.data || jsonb_build_object('id', b.id, 'authors', a.authors) AS books
   FROM   book b
   LEFT   JOIN (  -- LEFT JOIN to include books without authors
      SELECT book_id, jsonb_agg(data_plus) AS authors
      FROM  (
         SELECT ba.book_id, jsonb_set(a.data, '{id}', to_jsonb(a.id)) AS data_plus
         FROM   book_author ba
         JOIN   author a ON a.id = ba.author_id
         ORDER  BY ba.book_id, ba.author_id
         ) a0
      GROUP  BY 1
      ) a ON a.book_id = b.id
   ORDER  BY b.id
   ) b0
$func$  LANGUAGE sql STABLE;
```
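Major points
- Make it SQL, which is simpler. There is no need for plpgsql.
- Make it `STABLE`.
- Don't omit the keyword `AS` for column aliases.
- Use `jsonb_agg()`.
- If you just want to add the `id` column as a key to `data`, there are simpler ways:
  - With the new `jsonb_set()` in Postgres 9.5: `jsonb_set(data, '{id}', to_jsonb(id))`. This adds the object or updates the value of an existing object with the same key, the equivalent of an UPSERT in SQL. You can also restrict the operation to UPDATE only; see the manual. I use this in the inner subquery to add a single key.
  - Concatenate two `jsonb` values: `b.data || jsonb_build_object('id', b.id, 'authors', a.authors)`. Again, existing keys on the same level in the left value are replaced by keys in the right value. I build the right value with `jsonb_build_object()`. I use this in the outer subquery, where it is simpler for adding multiple keys (and to demonstrate both options). Related: Return as array of JSON objects in SQL (Postgres)

Your original query casts all values to `text`, because the `->>` operator returns `text`.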
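The two ways of adding keys behave as follows. This is a minimal sketch on literal values that are made up for illustration; it assumes Postgres 9.5 or later, where `jsonb_set()` and `||` for `jsonb` are available.

```sql
-- jsonb_set() adds a missing key (create_missing defaults to true) ...
SELECT jsonb_set('{"name": "Arthur C. Clarke"}'::jsonb, '{id}', to_jsonb(1));
-- {"id": 1, "name": "Arthur C. Clarke"}

-- ... and overwrites the value of an existing key.
SELECT jsonb_set('{"id": 99, "name": "Arthur C. Clarke"}'::jsonb, '{id}', to_jsonb(1));
-- {"id": 1, "name": "Arthur C. Clarke"}

-- With create_missing = false it only updates; a missing key is not added.
SELECT jsonb_set('{"name": "Arthur C. Clarke"}'::jsonb, '{id}', to_jsonb(1), false);
-- {"name": "Arthur C. Clarke"}

-- || merges top-level keys; on a conflict the right operand wins.
SELECT '{"id": 99, "pages": 432}'::jsonb
    || jsonb_build_object('id', 1, 'authors', '[]'::jsonb);
-- {"id": 1, "pages": 432, "authors": []}
```

Note that `||` only merges at the top level; nested objects in the right operand replace the left ones wholesale rather than being merged recursively.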
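Test result
To test the result of the function for the existence of an author:

```sql
SELECT public.book_get() @> '[{"authors": [{"name":"Arthur C. Clarke"}]}]';
```

You have to match the JSON structure in the pattern, and it only works for exact matches.
Or you can use `jsonb_array_elements()`, as in your update, for partial matches.
Both approaches are expensive, because you test only after building the JSON document from three whole tables.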
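A partial-match test on the finished document could look like the following sketch. It assumes the `jsonb`-returning `public.book_get()` from the basic function above; the aliases `b(book)` and `a(author)` are only illustrative. The `CASE` guard is there because `jsonb_array_elements()` raises an error for a JSON `null`, which is what books without authors carry under the `authors` key.

```sql
SELECT EXISTS (
   SELECT 1
   FROM   jsonb_array_elements(public.book_get()) AS b(book)   -- one row per book object
   CROSS  JOIN LATERAL jsonb_array_elements(
             CASE WHEN jsonb_typeof(b.book -> 'authors') = 'array'
                  THEN b.book -> 'authors'
                  ELSE '[]'::jsonb                              -- treat null / missing as "no authors"
             END) AS a(author)                                  -- one row per author object
   WHERE  a.author ->> 'name' LIKE 'Arthur %'
   ) AS has_matching_author;
```

Like the containment test, this still builds the whole document first, so it shares the same cost.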
Filter first
To actually filter books that have the given author (possibly among others!), adapt your underlying query. You asked to filter books that ...
have an author with a middle name 'C.' or a first name 'Arthur'.

```sql
SELECT jsonb_agg(b.data || jsonb_build_object('id', b.id, 'authors', a.authors) ORDER BY b.id) AS books
FROM   book b
     , LATERAL (  -- CROSS JOIN since we filter before the join
   SELECT jsonb_agg(jsonb_set(a.data, '{id}', to_jsonb(a.id)) ORDER BY a.id) AS authors
   FROM   book_author ba
   JOIN   author a ON a.id = ba.author_id
   WHERE  ba.book_id = b.id
   ) a
WHERE  EXISTS (
   SELECT 1                                 -- one of the authors matches
   FROM   book_author ba
   JOIN   author a ON a.id = ba.author_id
   WHERE  ba.book_id = b.id
   AND   (a.data->>'name' LIKE '% C. %' OR  -- middle name 'C.'
          a.data->>'name' LIKE 'Arthur %')  -- or a first name 'Arthur'.
   );
```
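This filters books that have at least one matching author before the result is built.
Note how I use `ORDER BY` as a modifier inside the aggregate function `jsonb_agg()` instead of sorting in a subquery, as in the previous example.
- PostgreSQL: concatenate arrays in a GROUP BY clause
- How to get the invoice count for the past 12 weeks in Postgres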
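If your tables are big and you need fast queries, use indexes! For this particular query, a functional trigram GIN index like this should work wonders for big tables:

```sql
CREATE INDEX author_special_idx ON author USING gin ((data->>'name') gin_trgm_ops);
```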
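The operator class `gin_trgm_ops` comes from the `pg_trgm` extension, so the extension has to be installed in the database before the index can be created. A minimal sketch, assuming you have the privileges to create extensions:

```sql
-- Install the trigram extension once per database (no-op if already present).
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- The predicate must use the same expression as the index, data->>'name',
-- for the planner to consider the index at all.
EXPLAIN
SELECT id
FROM   author
WHERE  data->>'name' LIKE 'Arthur %';
```

On a tiny table such as the test data here, the planner will most likely still choose a sequential scan; whether the index pays off depends on the table size and on how selective the pattern is.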
Detailed explanation and instructions:
- Pattern matching on jsonb key/value
- Index for finding an element in a JSON array
I'd recommend a great tutorial on using JSON in PostgreSQL. If you create your data this way:

```sql
CREATE TABLE json_test (
  id   serial PRIMARY KEY,
  data jsonb
);
INSERT INTO json_test (data) VALUES
  ('{"id":1,"title":"2001: A Space Odyssey","pages":432,"authors":[{"id":1,"fullname":"Arthur C. Clarke"}]}'),
  ('{"id":2,"title":"The City And The City","pages":300,"authors":[{"id":2,"fullname":"China Miéville"}]}'),
  ('{"id":3,"title":"Unknown Book","pages":143,"authors":null}');
```
You can select the row with a specific id:

```sql
SELECT * FROM json_test
WHERE data @> '{"id":2}';
```

Or look for a specific name in a sub-array:

```sql
SELECT * FROM json_test
WHERE data -> 'authors' @> '[{"fullname":"Arthur C. Clarke"}]';
```

Or find books with more than 200 pages:

```sql
SELECT * FROM json_test
WHERE (data -> 'pages')::text::int > 200;
```
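Two possible variations on the queries above, as a sketch against the same `json_test` table. The index name `json_test_data_idx` is hypothetical. A GIN index on the whole `data` column can only support containment tests written against `data` itself, so the author lookup is rewritten here to test the full document instead of `data -> 'authors'`.

```sql
-- ->> returns text directly, so a single cast is enough.
SELECT * FROM json_test
WHERE (data ->> 'pages')::int > 200;

-- Optional GIN index supporting the @> containment operator on data.
CREATE INDEX json_test_data_idx ON json_test USING gin (data jsonb_path_ops);

-- Same author lookup, expressed as containment on the indexed column.
SELECT * FROM json_test
WHERE data @> '{"authors":[{"fullname":"Arthur C. Clarke"}]}';
```

The `jsonb_path_ops` operator class is a design choice for smaller, faster containment-only indexes; the default `jsonb_ops` class would also work and additionally supports the key-existence operators.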