Portability via Static Linking of libpq
by Christoph Schiessl on DevOps, PostgreSQL, and Docker
The PostgreSQL packages shipping with Linux distributions generally depend on libldap. As it turns out, PostgreSQL has an obscure feature that lets you use LDAP for authentication. By itself, this is not a big deal, but unfortunately, libldap also pulls in many indirect dependencies. I have never encountered a project in the wild using this feature, so for the vast majority of PostgreSQL users, it should be safe to remove LDAP support in favor of fewer dependencies and more portability.
Standard Package
We will be using Alpine 3.10 and Docker for our experiments. Let's start with Alpine's standard package to establish a baseline:
FROM alpine:3.10
# Install Alpine's official PostgreSQL package ...
RUN apk add --no-cache postgresql-dev
# We need this later on ...
RUN apk add --no-cache build-base
WORKDIR /work
VOLUME /work
ENTRYPOINT ["sh", "-c"]
Build the Docker image with the following command:
docker build -t bugfactory/postgresql-standard .
Custom Package
To get Alpine's package manager to build our customized package, we have to modify the APKBUILD file for the PostgreSQL package in the aports repository. Here is my patch:
diff --git a/main/postgresql/APKBUILD b/main/postgresql/APKBUILD
index 15d7f98..9bfefbd 100644
--- a/main/postgresql/APKBUILD
+++ b/main/postgresql/APKBUILD
@@ -15,7 +15,7 @@ pkggroups="postgres"
checkdepends="diffutils"
depends_dev="openssl-dev"
makedepends="$depends_dev libedit-dev zlib-dev libxml2-dev util-linux-dev
- openldap-dev tcl-dev perl-dev python2-dev python3-dev"
+ tcl-dev perl-dev python2-dev python3-dev"
subpackages="$pkgname-contrib $pkgname-dev $pkgname-doc libpq $pkgname-libs
$pkgname-client $pkgname-pltcl
$pkgname-plperl $pkgname-plperl-contrib:plperl_contrib
@@ -129,7 +129,7 @@ _configure() (
--prefix=/usr \
--mandir=/usr/share/man \
--with-system-tzdata=/usr/share/zoneinfo \
- --with-ldap \
+ --without-ldap \
--with-libedit-preferred \
--with-libxml \
--with-openssl \
@@ -140,14 +140,6 @@ _configure() (
--with-tcl
)
-check() {
- cd "$builddir"
-
- _run_tests src/test
- _run_tests src/pl
- _run_tests contrib
-}
-
package() {
cd "$builddir"
As you can see, it has only a few changes. The most significant one is the replacement of --with-ldap with --without-ldap. The other two remove the build dependency on openldap-dev and turn off the check step that normally runs the test suite after building. The corresponding Dockerfile looks like this (it seems more complicated than it is):
FROM alpine:3.10
# Install Alpine's package SDK and generate a temporary key to sign our custom
# PostgreSQL package ...
RUN apk add --no-cache alpine-sdk && \
    addgroup root abuild && \
    abuild-keygen -n --append --install
ADD postgresql-without-ldap.patch .
RUN export REPO="https://github.com/alpinelinux/aports.git" && \
    # Clone the aports repository from GitHub and patch the APK build script to
    # remove LDAP support ...
    git clone --depth 1 --branch 3.10-stable --single-branch $REPO && \
    (cd aports && git apply ../postgresql-without-ldap.patch) && \
    # Compile PostgreSQL from source using our patched build script ...
    (cd aports/main/postgresql && abuild -F deps && abuild -rF && abuild -F undeps) && \
    # Install our custom PostgreSQL version ...
    apk add --no-cache --repository /root/packages/main/ postgresql-dev && \
    # Remove files we no longer need ...
    rm -rf aports postgresql-without-ldap.patch /root/packages
# We need this later on ...
RUN apk add --no-cache build-base
WORKDIR /work
VOLUME /work
ENTRYPOINT ["sh", "-c"]
Build the Docker image with the following command (this will take a few minutes because it has to compile PostgreSQL from source):
docker build -t bugfactory/postgresql-without-ldap .
Comparison
$ docker run bugfactory/postgresql-standard "ldd /usr/lib/libpq.so"
/lib/ld-musl-x86_64.so.1 (0x79a73484a000)
libssl.so.1.1 => /lib/libssl.so.1.1 (0x79a734779000)
libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x79a7344fb000)
libldap_r-2.4.so.2 => /usr/lib/libldap_r-2.4.so.2 (0x79a7344b2000)
libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x79a73484a000)
liblber-2.4.so.2 => /usr/lib/liblber-2.4.so.2 (0x79a7344a4000)
libsasl2.so.3 => /usr/lib/libsasl2.so.3 (0x79a734489000)
$ docker run bugfactory/postgresql-without-ldap "ldd /usr/lib/libpq.so"
/lib/ld-musl-x86_64.so.1 (0x7af5ab3af000)
libssl.so.1.1 => /lib/libssl.so.1.1 (0x7af5ab2df000)
libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x7af5ab061000)
libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7af5ab3af000)
As you can see, the standard PostgreSQL package depends on libldap, whereas our version doesn't. Furthermore, it no longer requires all the indirect dependencies that libldap pulled in.
Motivation
Dynamically linking libraries like libpq in your programs makes them very brittle. Let's say you build your program in your CI system and copy the executable to a target system for deployment (e.g., a web server). Would that work? Maybe. Firstly, the target system must have all the required libraries installed. Secondly, the installed versions of these libraries have to be compatible with the ones you used to build your program. In many cases, this is easier said than done.
One solution for this problem is to build statically linked programs. Let's look at a very simple C program as an example:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
    printf("Hello World!\n");
    exit(EXIT_SUCCESS);
}
We can compile this program with dynamic or static linking:
# Dynamic linking
$ docker run --volume $(pwd)/hello-world:/work \
bugfactory/postgresql-standard \
"gcc -o main main.c"
$ ldd hello-world/main
linux-vdso.so.1 (0x000076a4279b5000)
libc.musl-x86_64.so.1 => not found
# Static linking
$ docker run --volume $(pwd)/hello-world:/work \
bugfactory/postgresql-without-ldap \
"gcc -static -o main main.c"
$ ldd hello-world/main
statically linked
Pay attention to this line: libc.musl-x86_64.so.1 => not found. The program compiled fine inside the Alpine container, but it's missing a library on my host system. This makes the program all but useless outside of the Alpine container. The takeaway here is that static linking makes your programs much more portable.
Interplay with libpq
Now imagine a slightly more complicated program with a dependency on libpq:
#include <stdlib.h>
#include "libpq-fe.h"
int main(void) {
    const char *conninfo = "dbname = postgres";
    PGconn *conn = PQconnectdb(conninfo);
    // Do something with conn ... doesn't matter for this example.
    // Release the connection object; this is needed even if connecting failed.
    PQfinish(conn);
    exit(EXIT_SUCCESS);
}
If you link this program dynamically, you gain a lot of indirect dependencies in addition to libpq itself. Observe:
$ docker run --volume $(pwd)/hello-postgres:/work \
bugfactory/postgresql-standard \
"gcc -o main main.c -lpq && ldd main"
/lib/ld-musl-x86_64.so.1 (0x7c76c07ad000)
libpq.so.5 => /usr/lib/libpq.so.5 (0x7c76c0757000)
libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7c76c07ad000)
libssl.so.1.1 => /lib/libssl.so.1.1 (0x7c76c06d7000)
libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x7c76c0459000)
libldap_r-2.4.so.2 => /usr/lib/libldap_r-2.4.so.2 (0x7c76c0410000)
liblber-2.4.so.2 => /usr/lib/liblber-2.4.so.2 (0x7c76c0402000)
libsasl2.so.3 => /usr/lib/libsasl2.so.3 (0x7c76c03e7000)
Now, let's try dynamic linking again, but this time with our custom version of PostgreSQL:
$ docker run --volume $(pwd)/hello-postgres:/work \
bugfactory/postgresql-without-ldap \
"gcc -o main main.c -lpq && ldd main"
/lib/ld-musl-x86_64.so.1 (0x70927b2fb000)
libpq.so.5 => /usr/lib/libpq.so.5 (0x70927b2a6000)
libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x70927b2fb000)
libssl.so.1.1 => /lib/libssl.so.1.1 (0x70927b226000)
libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x70927afa8000)
This results in fewer dependencies, but we can do even better with static linking. Unfortunately, static linking requires you to explicitly specify direct and indirect dependencies when you compile your program. Dynamic linking is more convenient in this respect: every shared library records its own direct dependencies, so the dynamic loader can resolve indirect dependencies automatically, whereas a static archive carries no such information. Needless to say, manually finding and keeping all of your programs' indirect dependencies up to date can be difficult. This is one of the reasons why I decided to remove PostgreSQL's LDAP dependency in the first place!
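To make the dynamic case less magical: on Linux, a shared library lists its direct dependencies as DT_NEEDED entries in its ELF dynamic section, and the loader follows those entries recursively. The following is a minimal sketch of reading them, assuming a 64-bit ELF file in the host's byte order with intact section headers. The names read_at and print_needed are made up for this illustration, and the real readelf -d does far more validation.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads `size` bytes at file offset `off` into a fresh buffer that always
 * ends in a NUL byte. Returns NULL on any failure. */
static void *read_at(FILE *f, unsigned long long off, size_t size) {
    char *buf = malloc(size + 1);
    if (buf == NULL) return NULL;
    if (fseek(f, (long)off, SEEK_SET) != 0 || fread(buf, 1, size, f) != size) {
        free(buf);
        return NULL;
    }
    buf[size] = '\0';
    return buf;
}

/* Prints the DT_NEEDED entries (direct dependencies) of a 64-bit ELF file.
 * Returns the number of entries found, or -1 if the file cannot be parsed. */
int print_needed(const char *path) {
    assert(path != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL) return -1;

    int count = -1;
    Elf64_Ehdr eh;
    Elf64_Shdr *sh = NULL;
    if (fread(&eh, sizeof eh, 1, f) == 1 &&
        memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
        eh.e_ident[EI_CLASS] == ELFCLASS64 &&
        eh.e_shentsize == sizeof(Elf64_Shdr))
        sh = read_at(f, eh.e_shoff, (size_t)eh.e_shnum * sizeof *sh);

    if (sh != NULL) {
        count = 0;
        for (size_t i = 0; i < eh.e_shnum; i++) {
            if (sh[i].sh_type != SHT_DYNAMIC || sh[i].sh_link >= eh.e_shnum)
                continue;
            /* sh_link of the dynamic section points at its string table. */
            const Elf64_Shdr *st = &sh[sh[i].sh_link];
            Elf64_Dyn *dyn = read_at(f, sh[i].sh_offset, sh[i].sh_size);
            char *names = read_at(f, st->sh_offset, st->sh_size);
            size_t n = sh[i].sh_size / sizeof(Elf64_Dyn);
            for (size_t j = 0; dyn && names && j < n && dyn[j].d_tag != DT_NULL; j++) {
                if (dyn[j].d_tag == DT_NEEDED && dyn[j].d_un.d_val < st->sh_size) {
                    printf("NEEDED %s\n", names + dyn[j].d_un.d_val);
                    count++;
                }
            }
            free(dyn);
            free(names);
        }
    }
    free(sh);
    fclose(f);
    return count;
}
```

Calling print_needed on a shared library such as /usr/lib/libpq.so.5 would be expected to print only its direct dependencies; ldd shows more because it follows each entry in turn. A static archive has no equivalent list, which is why the linker needs every library spelled out.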
Let's first try static linking with Alpine's standard PostgreSQL package:
$ docker run --volume $(pwd)/hello-postgres:/work \
bugfactory/postgresql-standard \
"gcc -static -o main main.c -lpq"
# (very long output with many error messages)
When you try this, you get countless undefined reference errors from the linker because it can't find symbols such as the ones defined by libssl. You could, of course, manually specify more and more libraries on the command line: -lpq, -lssl, and so on. However, as I pointed out before, this is a lot of monkey work because you must also specify all indirect dependencies. You have to be aware of your program's whole dependency tree.
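One way to picture this bookkeeping is a small dependency table that gets flattened into linker flags. The sketch below is illustrative only: the table LIBS and the function link_flags are hypothetical names, and the edges in the table are an assumption loosely based on the ldd output above, not something read from the real libraries. It emits each library before the libraries it depends on, which is the order GNU ld expects when resolving symbols from static archives.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical dependency table; the edges are assumed for illustration. */
struct lib {
    const char *name;
    const char *deps[4]; /* NULL-terminated list of direct dependencies */
};

static const struct lib LIBS[] = {
    {"pq",     {"ssl", "crypto", NULL}},
    {"ssl",    {"crypto", NULL}},
    {"crypto", {NULL}},
};
#define NLIBS (sizeof LIBS / sizeof LIBS[0])

/* Returns the index of `name` in LIBS, or -1 if it is unknown. */
static int find_lib(const char *name) {
    for (size_t i = 0; i < NLIBS; i++)
        if (strcmp(LIBS[i].name, name) == 0) return (int)i;
    return -1;
}

/* Depth-first visit: a library is recorded only after all of its
 * dependencies have been recorded (post-order). */
static void visit(int idx, int *seen, int *post, int *count) {
    if (idx < 0 || seen[idx]) return;
    seen[idx] = 1;
    for (int d = 0; LIBS[idx].deps[d] != NULL; d++)
        visit(find_lib(LIBS[idx].deps[d]), seen, post, count);
    assert(*count < (int)NLIBS);
    post[(*count)++] = idx;
}

/* Writes flags for `root` and everything it needs into buf.
 * Returns 0 on success, -1 for an unknown library or a too-small buffer. */
int link_flags(const char *root, char *buf, size_t size) {
    int seen[NLIBS] = {0};
    int post[NLIBS];
    int count = 0;
    int root_idx = find_lib(root);
    if (root_idx < 0 || size == 0) return -1;

    visit(root_idx, seen, post, &count);
    buf[0] = '\0';
    /* Reverse post-order lists each library before the ones it needs. */
    for (int i = count - 1; i >= 0; i--) {
        size_t used = strlen(buf);
        int n = snprintf(buf + used, size - used, "%s-l%s",
                         used ? " " : "", LIBS[post[i]].name);
        if (n < 0 || (size_t)n >= size - used) return -1;
    }
    return 0;
}
```

With this table, link_flags("pq", buf, sizeof buf) fills buf with -lpq -lssl -lcrypto. Every extra library in the tree, such as libldap and its own dependencies, would add rows to the table and flags to the command line, which is exactly the maintenance burden described above.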
Now, let's try static linking again with our custom PostgreSQL package:
$ docker run --volume $(pwd)/hello-postgres:/work \
bugfactory/postgresql-without-ldap \
"gcc -static -o main main.c -lpq -lssl -lcrypto"
$ ldd hello-postgres/main
statically linked
By eliminating the dependency on libldap (and all of its dependencies), we have been able to remove all but two indirect dependencies: libssl (-lssl) and libcrypto (-lcrypto).
Conclusion
Static linking is a nice approach to make your programs more portable. However, for programs with complex dependencies, this can be difficult to achieve with standard packages. That said, package managers like apk can be hacked to remove unneeded dependencies, thereby making the building of statically linked programs more manageable.