Portability via Static Linking of libpq

by Christoph Schiessl on DevOps, PostgreSQL, and Docker

The PostgreSQL packages shipping with Linux distributions generally depend on libldap. As it turns out, PostgreSQL has an obscure feature that lets you use LDAP for authentication. By itself, this is not a big deal, but unfortunately, libldap also pulls in indirect dependencies of its own, such as liblber and libsasl2. I have never encountered a project where this feature was used, so for the vast majority of PostgreSQL users, it should be safe to remove LDAP support.

Standard Package

We will be using Alpine 3.10.3 and Docker for our experiments. Let's start with Alpine's standard package to establish a baseline:

FROM alpine:3.10.3

# Install Alpine's official PostgreSQL package ...
RUN apk add --no-cache postgresql-dev

# We need this later on ...
RUN apk add --no-cache build-base
WORKDIR /work
VOLUME /work

ENTRYPOINT ["sh", "-c"]

Build the Docker image with docker build -t bugfactory/postgresql-standard . (note the trailing dot, which is the build context).

Custom Package

To get Alpine's package manager to build our customized package, we have to modify the APKBUILD file for the PostgreSQL package in the aports repository. Here is my patch:

diff --git a/main/postgresql/APKBUILD b/main/postgresql/APKBUILD
index b01c22cebf..f925add761 100644
--- a/main/postgresql/APKBUILD
+++ b/main/postgresql/APKBUILD
@@ -15,7 +15,7 @@ pkggroups="postgres"
 checkdepends="diffutils"
 depends_dev="openssl-dev"
 makedepends="$depends_dev libedit-dev zlib-dev libxml2-dev util-linux-dev
-   openldap-dev tcl-dev perl-dev python2-dev python3-dev"
+   tcl-dev perl-dev python2-dev python3-dev"
 subpackages="$pkgname-contrib $pkgname-dev $pkgname-doc libpq $pkgname-libs
    $pkgname-client $pkgname-pltcl
    $pkgname-plperl $pkgname-plperl-contrib:plperl_contrib
@@ -33,7 +33,7 @@ source="https://ftp.postgresql.org/pub/source/v$pkgver/$pkgname-$pkgver.tar.bz2
    pltcl_create_tables.sql
    "
 builddir="$srcdir/$pkgname-$pkgver"
-options="!checkroot"
+options="!checkroot !check"

 # secfixes:
 #   11.5-r0:
@@ -114,7 +114,7 @@ _configure() (
        --prefix=/usr \
        --mandir=/usr/share/man \
        --with-system-tzdata=/usr/share/zoneinfo \
-       --with-ldap \
+       --without-ldap \
        --with-libedit-preferred \
        --with-libxml \
        --with-openssl \

As you can see, the patch makes only three changes. The most significant one replaces --with-ldap with --without-ldap. The other two remove the build dependency on openldap-dev and disable the check step that would otherwise run after successful compilation. The corresponding Dockerfile looks like this (it looks more complicated than it is):

FROM alpine:3.10.3

# Install Alpine's package SDK and generate a temporary key to sign our custom
# PostgreSQL package ...
RUN apk add --no-cache alpine-sdk && \
    addgroup root abuild && \
    abuild-keygen -n -a -i

COPY postgresql-without-ldap.patch .

RUN export REPO="https://github.com/alpinelinux/aports.git" && \
    # Clone the aports repository from GitHub and patch the APK build script to
    # remove LDAP support ...
    git clone --depth 1 --branch 3.10-stable --single-branch $REPO && \
    (cd aports && git apply ../postgresql-without-ldap.patch) && \
    # Compile PostgreSQL from source with our patched build script ...
    (cd aports/main/postgresql && abuild -F deps && abuild -rF && abuild -F undeps) && \
    # Install our custom PostgreSQL version ...
    apk add --no-cache --repository /root/packages/main/ postgresql-dev && \
    # Remove files we no longer need ...
    rm -rf aports postgresql-without-ldap.patch /root/packages

# We need this later on ...
RUN apk add --no-cache build-base
WORKDIR /work
VOLUME /work

ENTRYPOINT ["sh", "-c"]

Build the Docker image with docker build -t bugfactory/postgresql-without-ldap . (again with the trailing dot). This will take a few minutes because it has to compile PostgreSQL from source.

Comparison

$ docker run bugfactory/postgresql-standard "ldd /usr/lib/libpq.so"
    /lib/ld-musl-x86_64.so.1 (0x7f93b3911000)
    libssl.so.1.1 => /lib/libssl.so.1.1 (0x7f93b3847000)
    libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x7f93b35ce000)
    libldap_r-2.4.so.2 => /usr/lib/libldap_r-2.4.so.2 (0x7f93b3585000)
    libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f93b3911000)
    liblber-2.4.so.2 => /usr/lib/liblber-2.4.so.2 (0x7f93b3577000)
    libsasl2.so.3 => /usr/lib/libsasl2.so.3 (0x7f93b355c000)
$ docker run bugfactory/postgresql-without-ldap "ldd /usr/lib/libpq.so"
    /lib/ld-musl-x86_64.so.1 (0x7fd7d68a4000)
    libssl.so.1.1 => /lib/libssl.so.1.1 (0x7fd7d67dc000)
    libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x7fd7d6563000)
    libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7fd7d68a4000)

As you can see, the standard PostgreSQL package depends on libldap, whereas our version doesn't. Furthermore, it no longer requires all of the indirect dependencies that libldap pulled in.
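To make the difference between the two listings explicit, the ldd output can be post-processed. The following is only a sketch, assuming the output of each docker run command above was saved to a file. The function names ldd_libs and ldd_only_in_first and the file names are made up for illustration.

```shell
# Hypothetical helpers, not part of any package.

# Read ldd output on stdin and print the library names, i.e. the first
# column of every "name => path" line, sorted and de-duplicated.
ldd_libs() {
  awk '$2 == "=>" { print $1 }' | LC_ALL=C sort -u
}

# Print the libraries listed in the first saved ldd output ($1) that are
# absent from the second one ($2).
ldd_only_in_first() {
  _a=$(mktemp) && _b=$(mktemp) || return 1
  ldd_libs < "$1" > "$_a"
  ldd_libs < "$2" > "$_b"
  LC_ALL=C comm -23 "$_a" "$_b"
  rm -f "$_a" "$_b"
}

# Usage (file names are assumptions):
#   docker run bugfactory/postgresql-standard "ldd /usr/lib/libpq.so" > standard.txt
#   docker run bugfactory/postgresql-without-ldap "ldd /usr/lib/libpq.so" > custom.txt
#   ldd_only_in_first standard.txt custom.txt
```

For the two listings shown above, this would print liblber-2.4.so.2, libldap_r-2.4.so.2, and libsasl2.so.3.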

Motivation

Dynamically linking libraries like libpq into your programs makes them very brittle. Let's say you build your program in your CI system and copy the executable to a target system for deployment (e.g., a web server). Would that work? Maybe. Firstly, the target system must have all the required libraries installed. Secondly, the installed versions of these libraries have to be compatible with the ones you used to build your program. In many cases, this is easier said than done.

One solution for this problem is to build statically linked programs. Let's look at a very simple C program as an example:

#include<stdio.h>
#include<stdlib.h>

int main() {
  printf("Hello World!");
  exit(EXIT_SUCCESS);
}

We can compile this program with dynamic or with static linking:

# Dynamic linking
$ docker run --volume $(pwd)/hello-world:/work \
         bugfactory/postgresql-standard \
         "gcc -o main main.c"
$ ldd hello-world/main
    linux-vdso.so.1 (0x00007ffdc95ee000)
    libc.musl-x86_64.so.1 => not found
# Static linking
$ docker run --volume $(pwd)/hello-world:/work \
         bugfactory/postgresql-without-ldap \
         "gcc -static -o main main.c"
$ ldd hello-world/main
    statically linked

Pay attention to this line: libc.musl-x86_64.so.1 => not found. The program compiled fine inside the Alpine container, but it's missing a library on my host system. This makes the program all but useless outside of the Alpine container. The takeaway here is that static linking makes your programs much more portable.
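A check like this can be automated before an executable is copied to a target system. Here is a minimal sketch, assuming the ldd output is piped in on stdin. The name missing_libs is a hypothetical helper, not an existing tool.

```shell
# Hypothetical helper: read ldd output on stdin, print every library that
# ldd reports as "not found", and return a non-zero status if there is one.
missing_libs() {
  awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1; n++ }
       END { exit (n > 0) }'
}

# Usage on the target system (path taken from the example above):
#   ldd hello-world/main | missing_libs || echo "missing libraries, do not deploy"
```

A statically linked executable passes this check trivially because its ldd output contains no "=>" lines at all.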

Interplay with libpq

Now imagine a slightly more complicated program with a dependency on libpq:

#include<stdlib.h>
#include "libpq-fe.h"

int main() {
  const char *conninfo = "dbname = postgres";
  PGconn *conn = PQconnectdb(conninfo);
  // Do something with conn ... doesn't matter for this example.
  PQfinish(conn); // Close the connection and free the memory used by conn.
  exit(EXIT_SUCCESS);
}

If you link this program dynamically, you gain a lot of indirect dependencies in addition to libpq itself. Observe:

$ docker run --volume $(pwd)/hello-postgres:/work \
         bugfactory/postgresql-standard \
         "gcc -o main main.c -lpq && ldd main"
    /lib/ld-musl-x86_64.so.1 (0x7f7b399bf000)
    libpq.so.5 => /usr/lib/libpq.so.5 (0x7f7b3996f000)
    libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f7b399bf000)
    libssl.so.1.1 => /lib/libssl.so.1.1 (0x7f7b398f0000)
    libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x7f7b39677000)
    libldap_r-2.4.so.2 => /usr/lib/libldap_r-2.4.so.2 (0x7f7b3962e000)
    liblber-2.4.so.2 => /usr/lib/liblber-2.4.so.2 (0x7f7b39620000)
    libsasl2.so.3 => /usr/lib/libsasl2.so.3 (0x7f7b39605000)

Now, let's try dynamic linking again, but this time with our custom version of PostgreSQL:

$ docker run --volume $(pwd)/hello-postgres:/work \
         bugfactory/postgresql-without-ldap \
         "gcc -o main main.c -lpq && ldd main"
    /lib/ld-musl-x86_64.so.1 (0x7f7efc69b000)
    libpq.so.5 => /usr/lib/libpq.so.5 (0x7f7efc64d000)
    libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f7efc69b000)
    libssl.so.1.1 => /lib/libssl.so.1.1 (0x7f7efc5ce000)
    libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x7f7efc355000)

This definitely results in fewer dependencies, but we can do even better with static linking. The unfortunate reality is that static linking requires you to explicitly specify direct and indirect dependencies when you compile your program. Dynamic linking is a bit smarter: each shared library records which other libraries it needs, so indirect dependencies are found automatically, whereas a static archive carries no such information. Needless to say, manually finding and keeping all of your programs' indirect dependencies up to date can be difficult. This is why I decided to remove PostgreSQL's LDAP dependency in the first place!

Let's first try static linking with Alpine's standard PostgreSQL package:

$ docker run --volume $(pwd)/hello-postgres:/work \
         bugfactory/postgresql-standard \
         "gcc -static -o main main.c -lpq"
# (very long output with many error messages)

When you try this, you get countless undefined reference errors from the linker because it can't find symbols like the ones defined by libssl, for instance. You could, of course, manually specify more and more libraries on the command line: -lpq, -lssl, and so on. However, as I pointed out before, this is a lot of monkey work because you have to specify all indirect dependencies too. You have to be aware of your program's whole dependency tree.
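One way to avoid collecting these flags by hand is to ask pkg-config for them. This is a sketch under assumptions that I have not verified for the images above: that pkg-config is installed, that the PostgreSQL package ships a libpq.pc file, and that the private requirements listed in it are complete. The name static_libpq_flags is a hypothetical helper.

```shell
# Hypothetical helper: print the linker flags for a static link against
# libpq. With --static, pkg-config also lists the libraries that libpq
# itself needs, as far as they are declared in libpq.pc.
# Assumes pkg-config and a libpq.pc file are available.
static_libpq_flags() {
  pkg-config --static --libs libpq
}

# Usage inside the container (not run here):
#   gcc -static -o main main.c $(static_libpq_flags)
```

Static archives of every listed library still have to be installed, so this does not reduce the number of dependencies. It only saves you from tracking them manually.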

Now, let's try static linking again with our custom PostgreSQL package:

$ docker run --volume $(pwd)/hello-postgres:/work \
         bugfactory/postgresql-without-ldap \
         "gcc -static -o main main.c -lpq -lssl -lcrypto"
$ ldd hello-postgres/main
    statically linked

By eliminating the dependency on libldap (and all of its dependencies), we have been able to remove all but two indirect dependencies: libssl (-lssl) and libcrypto (-lcrypto).

Conclusion

Static linking is a nice approach if you need to make your programs more portable. However, for programs with complex dependencies, this can be difficult to achieve with standard packages. That said, package managers like apk can be hacked to remove unneeded dependencies and thereby make the building of statically linked programs more manageable.
