PostgreSQL新特性分析

134次閱讀

共計 7137 個字符，預計需要花費 18 分鐘才能閱讀完成。

這篇文章主要介紹“PostgreSQL 新特性分析”，在日常操作中，相信很多人在 PostgreSQL 新特性分析問題上存在疑惑，丸趣 TV 小編查閱了各式資料，整理出簡單好用的操作方法，希望對大家解答”PostgreSQL 新特性分析”的疑惑有所幫助！接下來，請跟著丸趣 TV 小編一起來學習吧！

在 PG 8.4 ~ PG 11,PG 會把 WITH 中的查詢視為”optimization fence”(優化圍欄, 與 WITH 外的查詢隔離, 獨立優化), 也就意味著謂詞下推等優化手段無法應用到 WITH 子句中, 考慮到 CTE 在大多數情況下是為了增強可讀性而存在, 因此在 PG 12 中, 滿足以下三個條件的, 優化器將不會對 CTE”視而不見”而是執行”積極的”優化.
A. 遞歸查詢
B. 沒有任何副作用(side effect)
C. 僅在查詢的后續部分引用一次

謂詞下推
測試腳本:

drop table if exists t_w1;
drop table if exists t_w2;
drop table if exists t_w3;
create table t_w1(id int ,c1 varchar(20));
create table t_w2(id int ,c1 varchar(20));
create table t_w3(id int ,c1 varchar(20));
insert into t_w1 select x,x||  from generate_series(1,10000) as x;
insert into t_w2 select x/2,(x/2)||  from generate_series(1,10000) as x;
insert into t_w3 select x,x||  from generate_series(1,10000) as x;

查詢語句:

WITH t1 AS ( SELECT * FROM t_w1 WHERE t_w1.id % 4 = 0 ) 
SELECT * FROM t1 
 JOIN t_w2 as t2 
 ON t2.id = t1.id
 AND t1.id   100;

在 PG 11 中, 其執行計劃如下:

version 
--------------------------------------------------------------------------------------------
 PostgreSQL 11.2 on x86_XX-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), XX-bit
(1 row)
testdb=# explain analyze WITH t1 AS ( SELECT * FROM t_w1 WHERE t_w1.id % 4 = 0 ) 
testdb-# SELECT * FROM t1 
testdb-# JOIN t_w2 as t2 
testdb-# ON t2.id = t1.id 
testdb-# AND t1.id   100;
 QUERY PLAN 
--------------------------------------------------------------------------------------------
 Hash Join (cost=205.34..396.18 rows=34 width=70) (actual time=8.576..11.187 rows=48 loops=1)
 Hash Cond: (t2.id = t1.id)
 CTE t1
 -  Seq Scan on t_w1 (cost=0.00..204.00 rows=50 width=8) (actual time=0.029..6.074 rows=2500 loops=1)
 Filter: ((id % 4) = 0)
 Rows Removed by Filter: 7500
 -  Seq Scan on t_w2 t2 (cost=0.00..153.00 rows=10000 width=8) (actual time=0.030..1.166 rows=10000 loops=1)
 -  Hash (cost=1.12..1.12 rows=17 width=62) (actual time=8.536..8.536 rows=24 loops=1)
 Buckets: 1024 Batches: 1 Memory Usage: 9kB
 -  CTE Scan on t1 (cost=0.00..1.12 rows=17 width=62) (actual time=0.033..8.521 rows=24 loops=1)
 Filter: (id   100)
 Rows Removed by Filter: 2476
 Planning Time: 1.913 ms
 Execution Time: 11.357 ms
(14 rows)

在 PG 12 中, 其執行計劃如下:

testdb=# select version();
 version 
--------------------------------------------------------------------------------------------
 PostgreSQL 12beta1 on x86_XX-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), XX-bit
(1 row)
testdb=# explain analyze WITH t1 AS ( SELECT * FROM t_w1 WHERE t_w1.id % 4 = 0 ) 
testdb-# SELECT * FROM t1 
testdb-# JOIN t_w2 as t2 
testdb-# ON t2.id = t1.id 
testdb-# AND t1.id   100;
 QUERY PLAN 
--------------------------------------------------------------------------------------------
 Hash Join (cost=229.01..419.52 rows=1 width=16) (actual time=6.974..17.156 rows=48 loops=1)
 Hash Cond: (t2.id = t_w1.id)
 -  Seq Scan on t_w2 t2 (cost=0.00..153.00 rows=10000 width=8) (actual time=0.076..5.205 rows=10000 loops=1)
 -  Hash (cost=229.00..229.00 rows=1 width=8) (actual time=6.882..6.882 rows=24 loops=1)
 Buckets: 1024 Batches: 1 Memory Usage: 9kB
 -  Seq Scan on t_w1 (cost=0.00..229.00 rows=1 width=8) (actual time=0.077..6.842 rows=24 loops=1)
 Filter: ((id   100) AND ((id % 4) = 0))
 Rows Removed by Filter: 9976
 Planning Time: 1.677 ms
 Execution Time: 17.244 ms
(10 rows)

可以看到, 在 PG 11 中, 謂詞 (id 100) 不會下推 CTE 中, 但在 PG 12 中, 優化器則把謂詞下推到 CTE 中(Filter: ((id 100) AND ((id % 4) = 0))).

New Option
如果希望 12 的優化器行為與先前的一樣, 則加入 Option : MATERIALIZED.

testdb=# explain analyze WITH t1 AS MATERIALIZED( SELECT * FROM t_w1 WHERE t_w1.id % 4 = 0 ) 
SELECT * FROM t1 
 JOIN t_w2 as t2 
 ON t2.id = t1.id 
 AND t1.id   100;
 QUERY PLAN 
-------------------------------------------------------------------------------------------
 Hash Join (cost=205.34..396.18 rows=34 width=70) (actual time=30.705..48.549 rows=48 loops=1)
 Hash Cond: (t2.id = t1.id)
 CTE t1
 -  Seq Scan on t_w1 (cost=0.00..204.00 rows=50 width=8) (actual time=0.152..21.274 rows=2500 loops=1)
 Filter: ((id % 4) = 0)
 Rows Removed by Filter: 7500
 -  Seq Scan on t_w2 t2 (cost=0.00..153.00 rows=10000 width=8) (actual time=0.154..8.582 rows=10000 loops=1)
 -  Hash (cost=1.12..1.12 rows=17 width=62) (actual time=30.502..30.502 rows=24 loops=1)
 Buckets: 1024 Batches: 1 Memory Usage: 9kB
 -  CTE Scan on t1 (cost=0.00..1.12 rows=17 width=62) (actual time=0.168..30.445 rows=24 loops=1)
 Filter: (id   100)
 Rows Removed by Filter: 2476
 Planning Time: 7.673 ms
 Execution Time: 49.284 ms
(14 rows)

如果希望優化器把盡可能的把 CTE 視為內聯查詢進行優化, 則指定 NOT MATERIALIZED Option:
下面的查詢,CTE 被引用多次, 優化器默認會進行 MATERIALIZED, 通過指定 NOT MATERIALIZED 則強制為內聯查詢.

testdb=# explain analyze WITH t1 AS ( SELECT * FROM t_w1 WHERE t_w1.id % 4 = 0 ) 
testdb-# SELECT * FROM t1 
testdb-# JOIN t_w2 as t2 
testdb-# ON t2.id = t1.id
testdb-# UNION ALL
testdb-# select t1.*,NULL,NULL from t1 where t1.id % 3 = 0;
 QUERY PLAN 
-----------------------------------------------------------------------------------------------------------------------
 Append (cost=205.62..399.89 rows=101 width=70) (actual time=11.663..27.725 rows=3332 loops=1)
 CTE t1
 -  Seq Scan on t_w1 (cost=0.00..204.00 rows=50 width=8) (actual time=0.032..7.300 rows=2500 loops=1)
 Filter: ((id % 4) = 0)
 Rows Removed by Filter: 7500
 -  Hash Join (cost=1.62..193.12 rows=100 width=70) (actual time=11.662..24.094 rows=2499 loops=1)
 Hash Cond: (t2.id = t1.id)
 -  Seq Scan on t_w2 t2 (cost=0.00..153.00 rows=10000 width=8) (actual time=0.033..4.412 rows=10000 loops=1)
 -  Hash (cost=1.00..1.00 rows=50 width=62) (actual time=11.611..11.612 rows=2500 loops=1)
 Buckets: 4096 (originally 1024) Batches: 1 (originally 1) Memory Usage: 132kB
 -  CTE Scan on t1 (cost=0.00..1.00 rows=50 width=62) (actual time=0.035..9.916 rows=2500 loops=1)
 -  CTE Scan on t1 t1_1 (cost=0.00..1.25 rows=1 width=98) (actual time=0.008..2.824 rows=833 loops=1)
 Filter: ((id % 3) = 0)
 Rows Removed by Filter: 1667
 Planning Time: 2.358 ms
 Execution Time: 28.746 ms
(16 rows)

使用 NOT MATERIALIZED 選項

testdb=# explain analyze WITH t1 AS NOT MATERIALIZED( SELECT * FROM t_w1 WHERE t_w1.id % 4 = 0 ) 
SELECT * FROM t1 
 JOIN t_w2 as t2 
 ON t2.id = t1.id
UNION ALL
select t1.*,NULL,NULL from t1 where t1.id % 3 = 0;
 QUERY PLAN 
-------------------------------------------------------------------------------------------
 Append (cost=204.62..650.39 rows=51 width=17) (actual time=27.894..57.453 rows=3332 loops=1)
 -  Hash Join (cost=204.62..395.62 rows=50 width=16) (actual time=27.892..48.911 rows=2499 loops=1)
 Hash Cond: (t2.id = t_w1.id)
 -  Seq Scan on t_w2 t2 (cost=0.00..153.00 rows=10000 width=8) (actual time=0.149..7.606 rows=10000 loops=1)
 -  Hash (cost=204.00..204.00 rows=50 width=8) (actual time=27.699..27.699 rows=2500 loops=1)
 Buckets: 4096 (originally 1024) Batches: 1 (originally 1) Memory Usage: 132kB
 -  Seq Scan on t_w1 (cost=0.00..204.00 rows=50 width=8) (actual time=0.151..22.446 rows=2500 loops=1)
 Filter: ((id % 4) = 0)
 Rows Removed by Filter: 7500
 -  Seq Scan on t_w1 t_w1_1 (cost=0.00..254.00 rows=1 width=44) (actual time=0.038..7.400 rows=833 loops=1)
 Filter: (((id % 4) = 0) AND ((id % 3) = 0))
 Rows Removed by Filter: 9167
 Planning Time: 12.357 ms
 Execution Time: 58.490 ms
(14 rows)

到此，關于“PostgreSQL 新特性分析”的學習就結束了，希望能夠解決大家的疑惑。理論與實踐的搭配能更好的幫助大家學習，快去試試吧！若想繼續學習更多相關知識，請繼續關注丸趣 TV 網站，丸趣 TV 小編會繼續努力為大家帶來更多實用的文章！

正文完