Portability via Static Linking of libpq

by Christoph Schiessl on DevOps, PostgreSQL, and Docker

The PostgreSQL packages shipping with Linux distributions generally depend on libldap, because PostgreSQL has a little-used feature that lets you authenticate through LDAP. By itself, this is not a big deal, but unfortunately, libldap also pulls in several indirect dependencies of its own. I have never encountered a project in the wild using this feature, so for the vast majority of PostgreSQL users, it should be safe to remove LDAP support in favor of fewer dependencies and more portability.

Standard Package

We will be using Alpine 3.10 and Docker for our experiments. Let's start with Alpine's standard package to establish a baseline:

FROM alpine:3.10

# Install Alpine's official PostgreSQL package ...
RUN apk add --no-cache postgresql-dev

# We need this later on ...
RUN apk add --no-cache build-base
WORKDIR /work
VOLUME /work

ENTRYPOINT ["sh", "-c"]

Build the Docker image with docker build -t bugfactory/postgresql-standard . (the trailing dot is the build context).

Custom Package

To get Alpine's package manager to build our customized package, we have to modify the APKBUILD file for the PostgreSQL package in the aports repository. Here is my patch:

diff --git a/main/postgresql/APKBUILD b/main/postgresql/APKBUILD
index 15d7f98..9bfefbd 100644
--- a/main/postgresql/APKBUILD
+++ b/main/postgresql/APKBUILD
@@ -15,7 +15,7 @@ pkggroups="postgres"
 checkdepends="diffutils"
 depends_dev="openssl-dev"
 makedepends="$depends_dev libedit-dev zlib-dev libxml2-dev util-linux-dev
-   openldap-dev tcl-dev perl-dev python2-dev python3-dev"
+   tcl-dev perl-dev python2-dev python3-dev"
 subpackages="$pkgname-contrib $pkgname-dev $pkgname-doc libpq $pkgname-libs
    $pkgname-client $pkgname-pltcl
    $pkgname-plperl $pkgname-plperl-contrib:plperl_contrib
@@ -129,7 +129,7 @@ _configure() (
        --prefix=/usr \
        --mandir=/usr/share/man \
        --with-system-tzdata=/usr/share/zoneinfo \
-       --with-ldap \
+       --without-ldap \
        --with-libedit-preferred \
        --with-libxml \
        --with-openssl \
@@ -140,14 +140,6 @@ _configure() (
        --with-tcl
 )

-check() {
-   cd "$builddir"
-
-   _run_tests src/test
-   _run_tests src/pl
-   _run_tests contrib
-}
-
 package() {
    cd "$builddir"

As you can see, it has only a few changes. The most significant one is the replacement of --with-ldap with --without-ldap. The other two remove the build dependency on openldap-dev and turn off the check step that normally runs the test suite after building. The corresponding Dockerfile looks like this (it seems more complicated than it is):

FROM alpine:3.10

# Install Alpine's package SDK and generate a temporary key to sign our custom
# PostgreSQL package ...
RUN apk add --no-cache alpine-sdk && \
    addgroup root abuild && \
    abuild-keygen -n --append --install

ADD postgresql-without-ldap.patch .

RUN export REPO="https://github.com/alpinelinux/aports.git" && \
    # Clone the aports repository from GitHub and patch the APK build script to
    # remove LDAP support ...
    git clone --depth 1 --branch 3.10-stable --single-branch $REPO && \
    (cd aports && git apply ../postgresql-without-ldap.patch) && \
    # Compile PostgreSQL from source using our patched build script ...
    (cd aports/main/postgresql && abuild -F deps && abuild -rF && abuild -F undeps) && \
    # Install our custom PostgreSQL version ...
    apk add --no-cache --repository /root/packages/main/ postgresql-dev && \
    # Remove files we no longer need ...
    rm -rf aports postgresql-without-ldap.patch /root/packages

# We need this later on ...
RUN apk add --no-cache build-base
WORKDIR /work
VOLUME /work

ENTRYPOINT ["sh", "-c"]

Build the Docker image with docker build -t bugfactory/postgresql-without-ldap . (the trailing dot is the build context). This will take a few minutes because it has to compile PostgreSQL from source.

Comparison

$ docker run bugfactory/postgresql-standard "ldd /usr/lib/libpq.so"
    /lib/ld-musl-x86_64.so.1 (0x79a73484a000)
    libssl.so.1.1 => /lib/libssl.so.1.1 (0x79a734779000)
    libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x79a7344fb000)
    libldap_r-2.4.so.2 => /usr/lib/libldap_r-2.4.so.2 (0x79a7344b2000)
    libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x79a73484a000)
    liblber-2.4.so.2 => /usr/lib/liblber-2.4.so.2 (0x79a7344a4000)
    libsasl2.so.3 => /usr/lib/libsasl2.so.3 (0x79a734489000)
$ docker run bugfactory/postgresql-without-ldap "ldd /usr/lib/libpq.so"
    /lib/ld-musl-x86_64.so.1 (0x7af5ab3af000)
    libssl.so.1.1 => /lib/libssl.so.1.1 (0x7af5ab2df000)
    libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x7af5ab061000)
    libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7af5ab3af000)

As you can see, the standard PostgreSQL package depends on libldap, whereas our version doesn't. Furthermore, it no longer requires all the indirect dependencies that libldap pulled in.
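
The comparison can also be done mechanically. The following is a minimal sketch, not part of the original setup: the helper names dep_names and only_in_first are made up for illustration, and the sample data is copied from the two ldd listings above. It strips the load addresses (which differ on every run) and prints the libraries that only the standard package needs.

```shell
#!/bin/sh
# Sketch: list the libraries that appear in one `ldd` listing but not in another.
export LC_ALL=C   # make sort and comm agree on the ordering

# Reduce `ldd` output on stdin to sorted library names: drop the indentation,
# the load address, and the resolved path.
dep_names() {
    sed -e 's/^[[:space:]]*//' \
        -e 's/ (0x[0-9a-f]*)$//' \
        -e 's/ => .*$//' | sort -u
}

# Print the names found in the first listing (file $1) but not in the second ($2).
only_in_first() {
    tmp=$(mktemp -d)
    dep_names < "$1" > "$tmp/a"
    dep_names < "$2" > "$tmp/b"
    comm -23 "$tmp/a" "$tmp/b"
    rm -rf "$tmp"
}

# Sample data: the two listings from this article.
workdir=$(mktemp -d)
cat > "$workdir/standard.txt" <<'EOF'
    /lib/ld-musl-x86_64.so.1 (0x79a73484a000)
    libssl.so.1.1 => /lib/libssl.so.1.1 (0x79a734779000)
    libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x79a7344fb000)
    libldap_r-2.4.so.2 => /usr/lib/libldap_r-2.4.so.2 (0x79a7344b2000)
    libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x79a73484a000)
    liblber-2.4.so.2 => /usr/lib/liblber-2.4.so.2 (0x79a7344a4000)
    libsasl2.so.3 => /usr/lib/libsasl2.so.3 (0x79a734489000)
EOF
cat > "$workdir/custom.txt" <<'EOF'
    /lib/ld-musl-x86_64.so.1 (0x7af5ab3af000)
    libssl.so.1.1 => /lib/libssl.so.1.1 (0x7af5ab2df000)
    libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x7af5ab061000)
    libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7af5ab3af000)
EOF

only_in_first "$workdir/standard.txt" "$workdir/custom.txt"
```

For the data above, this prints liblber-2.4.so.2, libldap_r-2.4.so.2, and libsasl2.so.3, one per line. In practice, you would redirect the output of the two docker run commands into the files instead of using heredocs.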

Motivation

Dynamically linking libraries like libpq into your programs makes them brittle. Let's say you build your program in your CI system and copy the executable to a target system for deployment (e.g., a web server). Would that work? Maybe. Firstly, the target system must have all the required libraries installed. Secondly, the installed versions of these libraries have to be compatible with the ones you used to build your program. In many cases, this is easier said than done.

One solution for this problem is to build statically linked programs. Let's look at a very simple C program as an example:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
  printf("Hello World!\n");
  exit(EXIT_SUCCESS);
}

We can compile this program with dynamic or static linking:

# Dynamic linking
$ docker run --volume $(pwd)/hello-world:/work \
         bugfactory/postgresql-standard \
         "gcc -o main main.c"
$ ldd hello-world/main
    linux-vdso.so.1 (0x000076a4279b5000)
    libc.musl-x86_64.so.1 => not found
# Static linking
$ docker run --volume $(pwd)/hello-world:/work \
         bugfactory/postgresql-without-ldap \
         "gcc -static -o main main.c"
$ ldd hello-world/main
    statically linked

Pay attention to this line: libc.musl-x86_64.so.1 => not found. The program compiled fine inside the Alpine container, but it's missing a library on my host system. This makes the program all but useless outside of the Alpine container. The takeaway here is that static linking makes your programs much more portable.
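
A check like this is easy to automate before copying a binary to a target system. The following sketch is only an illustration: the function names missing_libs and deployable are hypothetical, and the sample input is the host-side ldd output shown above. It extracts every library that ldd reports as "not found".

```shell
#!/bin/sh
# Print the name of every library that `ldd` (output on stdin) reported
# as "not found".
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}

# Succeed only if the listing on stdin contains no unresolved libraries.
deployable() {
    [ -z "$(missing_libs)" ]
}

# Sample: host-side `ldd` output of the dynamically linked binary.
host_ldd='    linux-vdso.so.1 (0x000076a4279b5000)
    libc.musl-x86_64.so.1 => not found'

printf '%s\n' "$host_ldd" | missing_libs
printf '%s\n' "    statically linked" | deployable && echo "static binary: nothing missing"
```

On a real target system, you would pipe the output of ldd for your executable into deployable and abort the deployment if it fails.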

Interplay with libpq

Now imagine a slightly more complicated program with a dependency on libpq:

#include <stdlib.h>
#include "libpq-fe.h"

int main(void) {
  const char *conninfo = "dbname = postgres";
  PGconn *conn = PQconnectdb(conninfo);
  // Do something with conn ... doesn't matter for this example.
  PQfinish(conn); // Close the connection and free the memory used by conn.
  exit(EXIT_SUCCESS);
}

If you link this program dynamically, you gain a lot of indirect dependencies in addition to libpq itself. Observe:

$ docker run --volume $(pwd)/hello-postgres:/work \
         bugfactory/postgresql-standard \
         "gcc -o main main.c -lpq && ldd main"
    /lib/ld-musl-x86_64.so.1 (0x7c76c07ad000)
    libpq.so.5 => /usr/lib/libpq.so.5 (0x7c76c0757000)
    libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7c76c07ad000)
    libssl.so.1.1 => /lib/libssl.so.1.1 (0x7c76c06d7000)
    libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x7c76c0459000)
    libldap_r-2.4.so.2 => /usr/lib/libldap_r-2.4.so.2 (0x7c76c0410000)
    liblber-2.4.so.2 => /usr/lib/liblber-2.4.so.2 (0x7c76c0402000)
    libsasl2.so.3 => /usr/lib/libsasl2.so.3 (0x7c76c03e7000)

Now, let's try dynamic linking again, but this time with our custom version of PostgreSQL:

$ docker run --volume $(pwd)/hello-postgres:/work \
         bugfactory/postgresql-without-ldap \
         "gcc -o main main.c -lpq && ldd main"
    /lib/ld-musl-x86_64.so.1 (0x70927b2fb000)
    libpq.so.5 => /usr/lib/libpq.so.5 (0x70927b2a6000)
    libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x70927b2fb000)
    libssl.so.1.1 => /lib/libssl.so.1.1 (0x70927b226000)
    libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x70927afa8000)

This results in fewer dependencies, but we can do even better with static linking. Unfortunately, static linking requires you to explicitly specify both direct and indirect dependencies when you compile your program. Dynamic linking is more forgiving: every shared library records the libraries it needs itself, so the dynamic linker resolves indirect dependencies automatically at load time, whereas a static archive carries no such information. Needless to say, manually finding and keeping all of your programs' indirect dependencies up to date can be difficult. This is one of the reasons why I decided to remove PostgreSQL's LDAP dependency in the first place!

Let's first try static linking with Alpine's standard PostgreSQL package:

$ docker run --volume $(pwd)/hello-postgres:/work \
         bugfactory/postgresql-standard \
         "gcc -static -o main main.c -lpq"
# (very long output with many error messages)

When you try this, you get countless undefined reference errors from the linker because it can't find symbols like the ones defined by libssl, for instance. You could, of course, manually specify more and more libraries on the command line: -lpq, -lssl, and so on. However, as I pointed out before, this is a lot of monkey work because you must also specify all indirect dependencies. You have to be aware of your program's whole dependency tree.
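
One way to reduce the guesswork is to ask pkg-config, whose --static flag adds a package's private (indirect) libraries to the output of --libs. The sketch below rests on assumptions I have not verified for these images: that pkg-config is installed, that the PostgreSQL package ships a libpq.pc file, and that this file lists complete private dependencies. The function name static_link_flags is made up for illustration.

```shell
#!/bin/sh
# Print the linker flags for linking statically against package $1,
# including its private (indirect) dependencies. Fails if pkg-config
# or the package's metadata is missing.
static_link_flags() {
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "pkg-config is not installed" >&2
        return 1
    fi
    if ! pkg-config --exists "$1"; then
        echo "no pkg-config metadata found for $1" >&2
        return 1
    fi
    pkg-config --static --libs "$1"
}

# Hypothetical usage inside the build container (assumes libpq.pc exists):
#   gcc -static -o main main.c $(static_link_flags libpq)
```

Even when the metadata is available, the static archives (.a files) of every listed library still have to be installed for the link to succeed, so fewer dependencies remain the simpler route.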

Now, let's try static linking again with our custom PostgreSQL package:

$ docker run --volume $(pwd)/hello-postgres:/work \
         bugfactory/postgresql-without-ldap \
         "gcc -static -o main main.c -lpq -lssl -lcrypto"
$ ldd hello-postgres/main
    statically linked

By eliminating the dependency on libldap (and all of its dependencies), we have been able to remove all but two indirect dependencies: libssl (-lssl) and libcrypto (-lcrypto).

Conclusion

Static linking is a nice approach to make your programs more portable. However, for programs with complex dependencies, this can be difficult to achieve with standard packages. That said, package managers like apk can be hacked to remove unneeded dependencies, thereby making the building of statically linked programs more manageable.
