Custom ENUM Type Columns and ORDER BY

by Christoph Schiessl on PostgreSQL

Custom ENUM types in PostgreSQL are an excellent tool for enforcing certain database constraints, but you must be careful if you use SELECT queries and want to ORDER BY these columns. Recently, I had to fix a bug whose root cause was a misunderstanding of this behavior. It's just a contrived example, but imagine a table of people with their marital status, which is implemented as a custom ENUM type.

postgres=# CREATE TYPE marital_status_type
postgres-#   AS ENUM ('single', 'married', 'divorced', 'other');
CREATE TYPE
postgres=# CREATE TABLE people (
postgres(#   id SERIAL PRIMARY KEY,
postgres(#   marital_status marital_status_type NOT NULL DEFAULT 'single'
postgres(# );
CREATE TABLE
postgres=# CREATE INDEX index_people_on_marital_status
postgres-#   ON people (marital_status);
CREATE INDEX
postgres=# \d people
                                      Table "public.people"
     Column     |        Type         | Collation | Nullable |              Default
----------------+---------------------+-----------+----------+------------------------------------
 id             | integer             |           | not null | nextval('people_id_seq'::regclass)
 marital_status | marital_status_type |           | not null | 'single'::marital_status_type
Indexes:
    "people_pkey" PRIMARY KEY, btree (id)
    "index_people_on_marital_status" btree (marital_status)

The ENUM defines all possible values for the marital status. If we try to INSERT a different value, then the INSERT statement fails, and we get an error.

postgres=# INSERT INTO people (marital_status)
postgres-#   VALUES ('single'), ('married'), ('divorced'), ('other');
INSERT 0 4
postgres=# INSERT INTO people (marital_status) VALUES ('invalid');
ERROR:  invalid input value for enum marital_status_type: "invalid"
LINE 1: INSERT INTO people (marital_status) VALUES ('invalid');
                                                    ^
postgres=# SELECT * FROM people ORDER BY id;
 id | marital_status
----+----------------
  1 | single
  2 | married
  3 | divorced
  4 | other
(4 rows)

Now, what will happen if we ORDER BY marital status? Let's try it out.

postgres=# SELECT * FROM people ORDER BY marital_status;
 id | marital_status
----+----------------
  1 | single
  2 | married
  3 | divorced
  4 | other
(4 rows)

As you can see, we get an ordering corresponding to the order in which the ENUM values have been defined. And this was the bug that I mentioned in the beginning. The code I was working with was broken because it incorrectly assumed alphabetical ordering. The minimal change needed to get alphabetical ordering is a simple typecast so that ENUM values are converted to TEXT before they are used in the ORDER BY clause:

postgres=# SELECT * FROM people ORDER BY CAST(marital_status AS TEXT);
 id | marital_status
----+----------------
  3 | divorced
  2 | married
  4 | other
  1 | single
(4 rows)

Unfortunately, this second SELECT query introduces a different problem. You can see it in the query plans of the two SELECT statements. Before doing so, I turn off sequential scans so the query planner doesn't mislead us.

postgres=# SET SESSION enable_seqscan = off;
SET
postgres=# EXPLAIN SELECT * FROM people ORDER BY marital_status;
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Index Scan using index_people_on_marital_status on people  (cost=0.15..82.06 rows=2260 width=8)
(1 row)

postgres=# EXPLAIN SELECT * FROM people ORDER BY CAST(marital_status AS TEXT);
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Sort  (cost=10000000169.81..10000000175.46 rows=2260 width=40)
   Sort Key: ((marital_status)::text)
   ->  Seq Scan on people  (cost=10000000000.00..10000000043.90 rows=2260 width=40)
 JIT:
   Functions: 2
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(6 rows)

So, the typecast makes it so our index on the marital_status column no longer applies. Depending on your application, this may be a big problem! My first reaction when I noticed this was to attempt to create a new index for the text-converted marital status:

postgres=# CREATE INDEX index_people_on_marital_status_text
postgres-#   ON people (CAST(marital_status AS TEXT));
ERROR:  functions in index expression must be marked IMMUTABLE

However, this did not work because the typecast to TEXT is mutable. The problem is that the values in the ENUM can be renamed, which would also change the output of the typecast. In the end, the typecast only returns the name of the ENUM value given to it in its definition. The solution for this problem is to create a custom conversion FUNCTION that is IMMUTABLE:

postgres=# CREATE FUNCTION marital_status_as_text(marital_status marital_status_type)
postgres-#   RETURNS TEXT LANGUAGE SQL IMMUTABLE AS $sql$ SELECT marital_status::TEXT; $sql$;
CREATE FUNCTION
postgres=# CREATE INDEX index_people_on_marital_status_text
postgres-#   ON people (marital_status_as_text(marital_status));
CREATE INDEX

With this FUNCTION, we tell PostgreSQL that our ENUM values will never be renamed. In any case, we can now finally SELECT people in alphabetical order according to their marital status while also using our index.

postgres=# EXPLAIN SELECT * FROM people ORDER BY marital_status_as_text(marital_status);
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Index Scan using index_people_on_marital_status_text on people  (cost=0.13..13.19 rows=4 width=40)
(1 row)

Note that this only works if we also use our custom conversion FUNCTION in our SELECT query because otherwise, the text-based index would not apply. If we use a simple typecast in the query, then the ordering still works, but the query planner has to fall back to sequential scanning.

postgres=# EXPLAIN SELECT * FROM people ORDER BY CAST(marital_status AS TEXT);
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Sort  (cost=10000000001.10..10000000001.11 rows=4 width=40)
   Sort Key: ((marital_status)::text)
   ->  Seq Scan on people  (cost=10000000000.00..10000000001.06 rows=4 width=40)
 JIT:
   Functions: 2
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(6 rows)

I hope you found this article helpful! Thank you very much for reading, and see you soon!

Ready to Learn More Web Development?

Join my Mailing List to receive 1-2 useful Articles per week.


I send up to two weekly emails on building performant and resilient Web Apps with Python, JavaScript and PostgreSQL. No spam. Unscubscribe at any time.

Continue Reading?

Here are a few more Articles for you ...


Repairing Corrupted Indexes with REINDEXing

This article outlines how to rebuild indexes with REINDEX. As an example, we will deliberately corrupt an index for a column that uses a custom ENUM column.

By Christoph Schiessl on PostgreSQL

Force Index Usage by Manipulating the Query Planner

Learn how to manipulate PostgreSQL's query planner to force it to use your indexes while working on optimizing the performance of your queries.

By Christoph Schiessl on PostgreSQL

Caching Expensive Queries with MATERIALIZED VIEWs

Learn how to use PostgreSQL's MATERIALIZED VIEWs to improve performance of complex queries. Persist query results and refresh them manually or automatically.

By Christoph Schiessl on PostgreSQL

Christoph Schiessl

Christoph Schiessl

Independent Consultant + Full Stack Developer


If you hire me, you can rely on more than a decade of experience, which I have collected working on web applications for many clients across multiple industries. My involvement usually focuses on hands-on development work using various technologies like Python, JavaScript, PostgreSQL, or whichever technology we determine to be the best tool for the job. Furthermore, you can also depend on me in an advisory capacity to make educated technological choices for your backend and frontend teams. Lastly, I can help you transition to or improve your agile development processes.