Custom ENUM Type Columns and ORDER BY

by Christoph Schiessl on PostgreSQL

Custom ENUM types in PostgreSQL are an excellent tool for enforcing certain database constraints, but you must be careful if you use SELECT queries and want to ORDER BY these columns. Recently, I had to fix a bug whose root cause was a misunderstanding of this behavior. It's just a contrived example, but imagine a table of people with their marital status, which is implemented as a custom ENUM type.

postgres=# CREATE TYPE marital_status_type
postgres-#   AS ENUM ('single', 'married', 'divorced', 'other');
CREATE TYPE
postgres=# CREATE TABLE people (
postgres(#   id SERIAL PRIMARY KEY,
postgres(#   marital_status marital_status_type NOT NULL DEFAULT 'single'
postgres(# );
CREATE TABLE
postgres=# CREATE INDEX index_people_on_marital_status
postgres-#   ON people (marital_status);
CREATE INDEX
postgres=# \d people
                                      Table "public.people"
     Column     |        Type         | Collation | Nullable |              Default
----------------+---------------------+-----------+----------+------------------------------------
 id             | integer             |           | not null | nextval('people_id_seq'::regclass)
 marital_status | marital_status_type |           | not null | 'single'::marital_status_type
Indexes:
    "people_pkey" PRIMARY KEY, btree (id)
    "index_people_on_marital_status" btree (marital_status)

The ENUM defines all possible values for the marital status. If we try to INSERT a different value, then the INSERT statement fails, and we get an error.

postgres=# INSERT INTO people (marital_status)
postgres-#   VALUES ('single'), ('married'), ('divorced'), ('other');
INSERT 0 4
postgres=# INSERT INTO people (marital_status) VALUES ('invalid');
ERROR:  invalid input value for enum marital_status_type: "invalid"
LINE 1: INSERT INTO people (marital_status) VALUES ('invalid');
                                                    ^
postgres=# SELECT * FROM people ORDER BY id;
 id | marital_status
----+----------------
  1 | single
  2 | married
  3 | divorced
  4 | other
(4 rows)

Now, what will happen if we ORDER BY marital status? Let's try it out.

postgres=# SELECT * FROM people ORDER BY marital_status;
 id | marital_status
----+----------------
  1 | single
  2 | married
  3 | divorced
  4 | other
(4 rows)

As you can see, we get an ordering corresponding to the order in which the ENUM values have been defined. And this was the bug that I mentioned in the beginning. The code I was working with was broken because it incorrectly assumed alphabetical ordering. The minimal change needed to get alphabetical ordering is a simple typecast so that ENUM values are converted to TEXT before they are used in the ORDER BY clause:

postgres=# SELECT * FROM people ORDER BY CAST(marital_status AS TEXT);
 id | marital_status
----+----------------
  3 | divorced
  2 | married
  4 | other
  1 | single
(4 rows)

Unfortunately, this second SELECT query introduces a different problem. You can see it in the query plans of the two SELECT statements. Before doing so, I turn off sequential scans so the query planner doesn't mislead us.

postgres=# SET SESSION enable_seqscan = off;
SET
postgres=# EXPLAIN SELECT * FROM people ORDER BY marital_status;
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Index Scan using index_people_on_marital_status on people  (cost=0.15..82.06 rows=2260 width=8)
(1 row)

postgres=# EXPLAIN SELECT * FROM people ORDER BY CAST(marital_status AS TEXT);
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Sort  (cost=10000000169.81..10000000175.46 rows=2260 width=40)
   Sort Key: ((marital_status)::text)
   ->  Seq Scan on people  (cost=10000000000.00..10000000043.90 rows=2260 width=40)
 JIT:
   Functions: 2
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(6 rows)

So, the typecast makes it so our index on the marital_status column no longer applies. Depending on your application, this may be a big problem! My first reaction when I noticed this was to attempt to create a new index for the text-converted marital status:

postgres=# CREATE INDEX index_people_on_marital_status_text
postgres-#   ON people (CAST(marital_status AS TEXT));
ERROR:  functions in index expression must be marked IMMUTABLE

However, this did not work because the typecast to TEXT is mutable. The problem is that the values in the ENUM can be renamed, which would also change the output of the typecast. In the end, the typecast only returns the name of the ENUM value given to it in its definition. The solution for this problem is to create a custom conversion FUNCTION that is IMMUTABLE:

postgres=# CREATE FUNCTION marital_status_as_text(marital_status marital_status_type)
postgres-#   RETURNS TEXT LANGUAGE SQL IMMUTABLE AS $sql$ SELECT marital_status::TEXT; $sql$;
CREATE FUNCTION
postgres=# CREATE INDEX index_people_on_marital_status_text
postgres-#   ON people (marital_status_as_text(marital_status));
CREATE INDEX

With this FUNCTION, we tell PostgreSQL that our ENUM values will never be renamed. In any case, we can now finally SELECT people in alphabetical order according to their marital status while also using our index.

postgres=# EXPLAIN SELECT * FROM people ORDER BY marital_status_as_text(marital_status);
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Index Scan using index_people_on_marital_status_text on people  (cost=0.13..13.19 rows=4 width=40)
(1 row)

Note that this only works if we also use our custom conversion FUNCTION in our SELECT query because otherwise, the text-based index would not apply. If we use a simple typecast in the query, then the ordering still works, but the query planner has to fall back to sequential scanning.

postgres=# EXPLAIN SELECT * FROM people ORDER BY CAST(marital_status AS TEXT);
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Sort  (cost=10000000001.10..10000000001.11 rows=4 width=40)
   Sort Key: ((marital_status)::text)
   ->  Seq Scan on people  (cost=10000000000.00..10000000001.06 rows=4 width=40)
 JIT:
   Functions: 2
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(6 rows)

I hope you found this article helpful! Thank you very much for reading, and see you soon!

Christoph Schiessl

Hi, I'm Christoph Schiessl.

I help you build robust and fast Web Applications.


I'm available for hire as a freelance web developer, so you can take advantage of my more than a decade of experience working on many projects across several industries. Most of my clients are building web-based SaaS applications in a B2B context and depend on my expertise in various capacities.

More often than not, my involvement includes hands-on development work using technologies like Python, JavaScript, and PostgreSQL. Furthermore, if you already have an established team, I can support you as a technical product manager with a passion for simplifying complex processes. Lastly, I'm an avid writer and educator who takes pride in breaking technical concepts down into the simplest possible terms.

Continue Reading?

Here are a few more Articles for you ...


Repairing Corrupted Indexes with REINDEXing

This article outlines how to rebuild indexes with REINDEX. As an example, we will deliberately corrupt an index for a column that uses a custom ENUM column.

By Christoph Schiessl on PostgreSQL

Force Index Usage by Manipulating the Query Planner

Learn how to manipulate PostgreSQL's query planner to force it to use your indexes while working on optimizing the performance of your queries.

By Christoph Schiessl on PostgreSQL

Caching Expensive Queries with MATERIALIZED VIEWs

Learn how to use PostgreSQL's MATERIALIZED VIEWs to improve performance of complex queries. Persist query results and refresh them manually or automatically.

By Christoph Schiessl on PostgreSQL

Web App Reverse Checklist

Ready to Build Your Next Web App?

Get my Web App Reverse Checklist first ...


Software Engineering is often driven by fashion, but swimming with the current is rarely the best choice. In addition to knowing what to do, it's equally important to know what not to do. And this is precisely what my free Web App Reverse Checklist will help you with.

Subscribe below to get your free copy of my Reverse Checklist delivered to your inbox. Afterward, you can expect one weekly email on building resilient Web Applications using Python, JavaScript, and PostgreSQL.

By the way, it goes without saying that I'm not sharing your email address with anyone, and you're free to unsubscribe at any time. No spam. No commitments. No questions asked.