Custom ENUM
Type Columns and ORDER BY
by Christoph Schiessl on PostgreSQL
Custom ENUM
types in PostgreSQL are an excellent tool for enforcing certain database constraints, but you must be careful if you use SELECT
queries and want to ORDER BY
these columns. Recently, I had to fix a bug whose root cause was a misunderstanding of this behavior. It's just a contrived example, but imagine a table of people with their marital status, which is implemented as a custom ENUM
type.
postgres=# CREATE TYPE marital_status_type
postgres-# AS ENUM ('single', 'married', 'divorced', 'other');
CREATE TYPE
postgres=# CREATE TABLE people (
postgres(# id SERIAL PRIMARY KEY,
postgres(# marital_status marital_status_type NOT NULL DEFAULT 'single'
postgres(# );
CREATE TABLE
postgres=# CREATE INDEX index_people_on_marital_status
postgres-# ON people (marital_status);
CREATE INDEX
postgres=# \d people
Table "public.people"
Column | Type | Collation | Nullable | Default
----------------+---------------------+-----------+----------+------------------------------------
id | integer | | not null | nextval('people_id_seq'::regclass)
marital_status | marital_status_type | | not null | 'single'::marital_status_type
Indexes:
"people_pkey" PRIMARY KEY, btree (id)
"index_people_on_marital_status" btree (marital_status)
The ENUM
defines all possible values for the marital status. If we try to INSERT
a different value, then the INSERT
statement fails, and we get an error.
postgres=# INSERT INTO people (marital_status)
postgres-# VALUES ('single'), ('married'), ('divorced'), ('other');
INSERT 0 4
postgres=# INSERT INTO people (marital_status) VALUES ('invalid');
ERROR: invalid input value for enum marital_status_type: "invalid"
LINE 1: INSERT INTO people (marital_status) VALUES ('invalid');
^
postgres=# SELECT * FROM people ORDER BY id;
id | marital_status
----+----------------
1 | single
2 | married
3 | divorced
4 | other
(4 rows)
Now, what will happen if we ORDER BY
marital status? Let's try it out.
postgres=# SELECT * FROM people ORDER BY marital_status;
id | marital_status
----+----------------
1 | single
2 | married
3 | divorced
4 | other
(4 rows)
As you can see, we get an ordering corresponding to the order in which the ENUM
values have been defined. And this was the bug that I mentioned in the beginning. The code I was working with was broken because it incorrectly assumed alphabetical ordering. The minimal change needed to get alphabetical ordering is a simple typecast so that ENUM
values are converted to TEXT
before they are used in the ORDER BY
clause:
postgres=# SELECT * FROM people ORDER BY CAST(marital_status AS TEXT);
id | marital_status
----+----------------
3 | divorced
2 | married
4 | other
1 | single
(4 rows)
Unfortunately, this second SELECT
query introduces a different problem. You can see it in the query plans of the two SELECT
statements. Before doing so, I turn off sequential scans so the query planner doesn't mislead us.
postgres=# SET SESSION enable_seqscan = off;
SET
postgres=# EXPLAIN SELECT * FROM people ORDER BY marital_status;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Index Scan using index_people_on_marital_status on people (cost=0.15..82.06 rows=2260 width=8)
(1 row)
postgres=# EXPLAIN SELECT * FROM people ORDER BY CAST(marital_status AS TEXT);
QUERY PLAN
------------------------------------------------------------------------------------
Sort (cost=10000000169.81..10000000175.46 rows=2260 width=40)
Sort Key: ((marital_status)::text)
-> Seq Scan on people (cost=10000000000.00..10000000043.90 rows=2260 width=40)
JIT:
Functions: 2
Options: Inlining true, Optimization true, Expressions true, Deforming true
(6 rows)
So, the typecast makes it so our index on the marital_status
column no longer applies. Depending on your application, this may be a big problem! My first reaction when I noticed this was to attempt to create a new index for the text-converted marital status:
postgres=# CREATE INDEX index_people_on_marital_status_text
postgres-# ON people (CAST(marital_status AS TEXT));
ERROR: functions in index expression must be marked IMMUTABLE
However, this did not work because the typecast to TEXT
is mutable. The problem is that the values in the ENUM
can be renamed, which would also change the output of the typecast. In the end, the typecast only returns the name of the ENUM
value given to it in its definition. The solution for this problem is to create a custom conversion FUNCTION
that is IMMUTABLE
:
postgres=# CREATE FUNCTION marital_status_as_text(marital_status marital_status_type)
postgres-# RETURNS TEXT LANGUAGE SQL IMMUTABLE AS $sql$ SELECT marital_status::TEXT; $sql$;
CREATE FUNCTION
postgres=# CREATE INDEX index_people_on_marital_status_text
postgres-# ON people (marital_status_as_text(marital_status));
CREATE INDEX
With this FUNCTION
, we tell PostgreSQL that our ENUM
values will never be renamed. In any case, we can now finally SELECT
people in alphabetical order according to their marital status while also using our index.
postgres=# EXPLAIN SELECT * FROM people ORDER BY marital_status_as_text(marital_status);
QUERY PLAN
----------------------------------------------------------------------------------------------------
Index Scan using index_people_on_marital_status_text on people (cost=0.13..13.19 rows=4 width=40)
(1 row)
Note that this only works if we also use our custom conversion FUNCTION
in our SELECT
query because otherwise, the text-based index would not apply. If we use a simple typecast in the query, then the ordering still works, but the query planner has to fall back to sequential scanning.
postgres=# EXPLAIN SELECT * FROM people ORDER BY CAST(marital_status AS TEXT);
QUERY PLAN
---------------------------------------------------------------------------------
Sort (cost=10000000001.10..10000000001.11 rows=4 width=40)
Sort Key: ((marital_status)::text)
-> Seq Scan on people (cost=10000000000.00..10000000001.06 rows=4 width=40)
JIT:
Functions: 2
Options: Inlining true, Optimization true, Expressions true, Deforming true
(6 rows)
I hope you found this article helpful! Thank you very much for reading, and see you soon!