Be careful with CTE in PostgreSQL

Photo by Adrian Trinkaus on Unsplash

The WITH clause

What’s so dangerous?

haki=# create table foo (id int, padding text);CREATE TABLEhaki=# insert into foo (id, padding) select id, md5(random()::text) from generate_series(1, 1000000) as id order by random();INSERT 0 1000000haki=# create index foo_id_ix on foo (id);CREATE INDEXhaki=# analyze foo;ANALYZE
haki=# select * from foo where id = 500000;id     | padding
— — — - — — — — — — — — — — — — — — — — —
500000 | b292eb19f3145fb087648d5956dfa44e
Time: 0.619 mshaki=# with cte as (select * from foo) select * from cte where id = 500000;id | padding
— — — — — — — — — — — — — — — — — — — — —
500000 | b292eb19f3145fb087648d5956dfa44e
Time: 227.675 ms
haki=# explain (analyze on, timing on) select * from foo where id = 500000;QUERY PLAN
— — — — — — — — — — — — — — —
Index Scan using foo_id_ix on foo (cost=0.42..8.44 rows=1 width=37) (actual time=0.026..0.028 rows=1 loops=1)
Index Cond: (id = 500000)
Execution time: 0.060 ms
haki=# explain (analyze on, timing on) with cte as (select * from foo) select * from cte where id = 500000;QUERY PLAN
------------------------------
CTE Scan on cte (cost=18334.00..40834.00 rows=5000 width=36) (actual time=3.243..269.290 rows=1 loops=1)
Filter: (id = 500000)
Rows Removed by Filter: 999999
CTE cte
-> Seq Scan on foo (cost=0.00..18334.00 rows=1000000 width=37) (actual time=0.029..77.078 rows=1000000 loops=1)
Execution time: 276.625 ms
haki=# explain (analyze on, timing on) select * from (select * from foo) as subquery where id = 500000;QUERY PLAN
------------------------------
Index Scan using foo_id_ix on foo (cost=0.42..8.44 rows=1 width=37) (actual time=0.028..0.031 rows=1 loops=1)
Index Cond: (id = 500000)
Execution time: 0.066 ms

It is possible to prevent PostgreSQL from materializing a CTE?

> SELECT * FROM foo WHERE id = 500000;-----------------------------------------------------------------
| Id | Operation | Name | Rows | Cost |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 21 |
| 1 | TABLE ACCESS BY INDEX ROWID | FOO | 1 | 21 |
| * 2 | INDEX RANGE SCAN | FOO_ID_IX | 37 | 1 |
-----------------------------------------------------------------
Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("ID"=500000)
> WITH cte AS (
SELECT * FROM foo
)
SELECT * FROM cte WHERE id = 500000;
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Cost |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 21 |
| 1 | TABLE ACCESS BY INDEX ROWID | FOO | 1 | 21 |
| * 2 | INDEX RANGE SCAN | FOO_ID_IX | 37 | 1 |
-----------------------------------------------------------------
Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("FOO"."ID"=500000)
> WITH cte AS (
SELECT /*+ MATERIALIZE */ * FROM foo
)
SELECT * FROM cte WHERE id = 500000;
--------------------------------------------------------------
| Id | Operation | Name | Rows | Cost |
--------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9308 | 46 |
| 1 | TEMP TABLE TRANSFORMATION | | | |
| 2 | LOAD AS SELECT | SYS_TEMP | | |
| 3 | TABLE ACCESS FULL | FOO | 9308 | 22 |
| * 4 | VIEW | | 9308 | 24 |
| 5 | TABLE ACCESS FULL | SYS_TEMP | 9308 | 24 |
--------------------------------------------------------------
Predicate Information (identified by operation id):
------------------------------------------
* 4 - filter("ID"=500000)

Predicate Push Down and CTE inlining

So should I stop using CTEs all together?

--

--

--

Full Stack Developer, Team Leader, Independent. More from me at https://hakibenita.com

Love podcasts or audiobooks? Learn on the go with our new app.

Recommended from Medium

Moving to dev.to

The End is here!

Simple tips to improve your Cypress e2e testing

Intro Ruby Enumerable Methods

Why I hate code challenges

Apache BookKeeper Internals — Part 4 — Back Pressure

Python: Stop Using range() Blindly in For Loops

A Day in the Life: What it’s like to Intern at Connected Lab

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Haki Benita

Haki Benita

Full Stack Developer, Team Leader, Independent. More from me at https://hakibenita.com

More from Medium

Database Basics: Database Introduction, SQL vs NoSQL

Databases Overview — Thumbnail

Python’s urllib. request for HTTP Requests

Man How I Added Authentication to a Flask API With Decorators

Let’s Make it Async; Making asynchronous HTTP requests with “aiohttp” in Python