libpq
by Christoph Schiessl on DevOps, PostgreSQL, and Docker
The PostgreSQL packages shipping with Linux distributions generally depend on libldap. As it turns out, PostgreSQL has a little-known feature that lets you use LDAP for authentication. By itself, this is not a big deal, but unfortunately, libldap also pulls in further indirect dependencies. I have never encountered a project where this feature was used, so for the vast majority of PostgreSQL users, it should be safe to remove LDAP support.
We will be using Alpine 3.10.3 and Docker for our experiments. Let's start with Alpine's standard package to establish a baseline:
FROM alpine:3.10.3
# Install Alpine's official PostgreSQL package ...
RUN apk add --no-cache postgresql-dev
# We need this later on ...
RUN apk add --no-cache build-base
WORKDIR /work
VOLUME /work
ENTRYPOINT ["sh", "-c"]
Build the Docker image with the following command:
$ docker build -t bugfactory/postgresql-standard .
To get Alpine's package manager to build our customized package, we have to modify the APKBUILD file for the PostgreSQL package in the aports repository. Here is my patch:
diff --git a/main/postgresql/APKBUILD b/main/postgresql/APKBUILD
index b01c22cebf..f925add761 100644
--- a/main/postgresql/APKBUILD
+++ b/main/postgresql/APKBUILD
@@ -15,7 +15,7 @@ pkggroups="postgres"
checkdepends="diffutils"
depends_dev="openssl-dev"
makedepends="$depends_dev libedit-dev zlib-dev libxml2-dev util-linux-dev
- openldap-dev tcl-dev perl-dev python2-dev python3-dev"
+ tcl-dev perl-dev python2-dev python3-dev"
subpackages="$pkgname-contrib $pkgname-dev $pkgname-doc libpq $pkgname-libs
$pkgname-client $pkgname-pltcl
$pkgname-plperl $pkgname-plperl-contrib:plperl_contrib
@@ -33,7 +33,7 @@ source="https://ftp.postgresql.org/pub/source/v$pkgver/$pkgname-$pkgver.tar.bz2
pltcl_create_tables.sql
"
builddir="$srcdir/$pkgname-$pkgver"
-options="!checkroot"
+options="!checkroot !check"
# secfixes:
# 11.5-r0:
@@ -114,7 +114,7 @@ _configure() (
--prefix=/usr \
--mandir=/usr/share/man \
--with-system-tzdata=/usr/share/zoneinfo \
- --with-ldap \
+ --without-ldap \
--with-libedit-preferred \
--with-libxml \
--with-openssl \
As you can see, it has only a few changes. The most significant one is the replacement of --with-ldap by --without-ldap. The other two remove the build dependency on openldap-dev and disable the check step after successful compilation. The corresponding Dockerfile looks like this (it seems more complicated than it is):
FROM alpine:3.10.3
# Install Alpine's package SDK and generate a temporary key to sign our custom
# PostgreSQL package ...
RUN apk add --no-cache alpine-sdk && \
addgroup root abuild && \
abuild-keygen -n -a -i
ADD postgresql-without-ldap.patch .
RUN export REPO="https://github.com/alpinelinux/aports.git" && \
# Clone the aports repository from GitHub and patch the APK build script to
# remove LDAP support ...
git clone --depth 1 --branch 3.10-stable --single-branch $REPO && \
(cd aports && git apply ../postgresql-without-ldap.patch) && \
# Compile PostgreSQL from source with our patched build script ...
(cd aports/main/postgresql && abuild -F deps && abuild -rF && abuild -F undeps) && \
# Install our custom PostgreSQL version ...
apk add --no-cache --repository /root/packages/main/ postgresql-dev && \
# Remove files we no longer need ...
rm -rf aports postgresql-without-ldap.patch /root/packages
# We need this later on ...
RUN apk add --no-cache build-base
WORKDIR /work
VOLUME /work
ENTRYPOINT ["sh", "-c"]
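Build the Docker image with the following command:
$ docker build -t bugfactory/postgresql-without-ldap .
This will take a few minutes because it has to compile PostgreSQL from source. Once both images are built, we can compare what libpq links against in each of them: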
$ docker run bugfactory/postgresql-standard "ldd /usr/lib/libpq.so"
/lib/ld-musl-x86_64.so.1 (0x7f93b3911000)
libssl.so.1.1 => /lib/libssl.so.1.1 (0x7f93b3847000)
libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x7f93b35ce000)
libldap_r-2.4.so.2 => /usr/lib/libldap_r-2.4.so.2 (0x7f93b3585000)
libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f93b3911000)
liblber-2.4.so.2 => /usr/lib/liblber-2.4.so.2 (0x7f93b3577000)
libsasl2.so.3 => /usr/lib/libsasl2.so.3 (0x7f93b355c000)
$ docker run bugfactory/postgresql-without-ldap "ldd /usr/lib/libpq.so"
/lib/ld-musl-x86_64.so.1 (0x7fd7d68a4000)
libssl.so.1.1 => /lib/libssl.so.1.1 (0x7fd7d67dc000)
libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x7fd7d6563000)
libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7fd7d68a4000)
As you can see, the standard PostgreSQL package depends on libldap, whereas our version doesn't. Furthermore, it no longer requires the indirect dependencies that libldap pulled in.
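To see exactly which libraries disappeared, the two ldd listings can also be compared mechanically. The following is a minimal sketch and not part of the setup described here: lib_names and only_in_first are hypothetical helper names, and it assumes both listings were first saved to files (for example, by redirecting the output of the two docker run commands).

```shell
# lib_names: read `ldd` output on stdin and print the library names
# (the part before "=>"), one per line, sorted and de-duplicated.
# Hypothetical helper for illustration.
lib_names() {
  awk '$2 == "=>" { print $1 }' | sort -u
}

# only_in_first A B: print libraries that appear in listing A but not in B.
# Hypothetical helper for illustration.
only_in_first() {
  a=$(mktemp) && b=$(mktemp) || return 1
  lib_names < "$1" > "$a"
  lib_names < "$2" > "$b"
  comm -23 "$a" "$b"   # keep column 1 only: lines unique to the first file
  rm -f "$a" "$b"
}
```

With the two listings saved as standard.txt and without-ldap.txt, running only_in_first standard.txt without-ldap.txt should print the three LDAP-related libraries: liblber, libldap_r, and libsasl2.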
Dynamically linking libraries like libpq into your programs makes them very brittle. Let's say you build your program in your CI system and copy the executable to a target system for deployment (e.g., a web server). Would that work? Maybe. First, the target system must have all the required libraries installed. Second, the installed versions of these libraries have to be compatible with the ones you used to build your program. In many cases, this is easier said than done.
One solution for this problem is to build statically linked programs. Let's look at a very simple C program as an example:
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    printf("Hello World!");
    exit(EXIT_SUCCESS);
}
We can compile this program with dynamic or with static linking:
# Dynamic linking
$ docker run --volume $(pwd)/hello-world:/work \
bugfactory/postgresql-standard \
"gcc -o main main.c"
$ ldd hello-world/main
linux-vdso.so.1 (0x00007ffdc95ee000)
libc.musl-x86_64.so.1 => not found
# Static linking
$ docker run --volume $(pwd)/hello-world:/work \
bugfactory/postgresql-without-ldap \
"gcc -static -o main main.c"
$ ldd hello-world/main
statically linked
Pay attention to this line: libc.musl-x86_64.so.1 => not found. The program compiled fine inside the Alpine container, but it's missing a library on my host system. This makes the program all but useless outside of the Alpine container. The takeaway here is that static linking makes your programs much more portable.
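Before deploying a dynamically linked executable, it can help to list the libraries that the target system cannot resolve. Here is a minimal sketch, assuming ldd is available on the target and reports unresolved entries in the "name => not found" form shown in the output; missing_libs is a hypothetical helper name.

```shell
# missing_libs: read `ldd` output on stdin and print the name of every
# library that the dynamic loader could not find on this system.
# Hypothetical helper for illustration.
missing_libs() {
  awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}
```

Running ldd hello-world/main | missing_libs on the host would print libc.musl-x86_64.so.1 for the dynamically linked build. For the statically linked build it prints nothing, because ldd reports no library entries at all.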
libpq
Now imagine a slightly more complicated program with a dependency on libpq:
#include <stdlib.h>
#include "libpq-fe.h"

int main(void) {
    const char *conninfo = "dbname = postgres";
    PGconn *conn = PQconnectdb(conninfo);
    // Do something with conn ... doesn't matter for this example.
    PQfinish(conn); // Close the connection and free the memory used by conn.
    exit(EXIT_SUCCESS);
}
If you link this program dynamically, you gain a lot of indirect dependencies in addition to libpq itself. Observe:
$ docker run --volume $(pwd)/hello-postgres:/work \
bugfactory/postgresql-standard \
"gcc -o main main.c -lpq && ldd main"
/lib/ld-musl-x86_64.so.1 (0x7f7b399bf000)
libpq.so.5 => /usr/lib/libpq.so.5 (0x7f7b3996f000)
libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f7b399bf000)
libssl.so.1.1 => /lib/libssl.so.1.1 (0x7f7b398f0000)
libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x7f7b39677000)
libldap_r-2.4.so.2 => /usr/lib/libldap_r-2.4.so.2 (0x7f7b3962e000)
liblber-2.4.so.2 => /usr/lib/liblber-2.4.so.2 (0x7f7b39620000)
libsasl2.so.3 => /usr/lib/libsasl2.so.3 (0x7f7b39605000)
Now, let's try dynamic linking again, but this time with our custom version of PostgreSQL:
$ docker run --volume $(pwd)/hello-postgres:/work \
bugfactory/postgresql-without-ldap \
"gcc -o main main.c -lpq && ldd main"
/lib/ld-musl-x86_64.so.1 (0x7f7efc69b000)
libpq.so.5 => /usr/lib/libpq.so.5 (0x7f7efc64d000)
libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f7efc69b000)
libssl.so.1.1 => /lib/libssl.so.1.1 (0x7f7efc5ce000)
libcrypto.so.1.1 => /lib/libcrypto.so.1.1 (0x7f7efc355000)
This definitely results in fewer dependencies, but we can do even better with static linking. The unfortunate reality is that static linking requires you to explicitly specify direct and indirect dependencies when you compile your program. Dynamic linking is a bit smarter: each shared library records the libraries it needs, so the dynamic loader finds indirect dependencies automatically, whereas a static archive carries no such list. Needless to say, manually finding all of your program's indirect dependencies and keeping them up to date can be difficult. This is why I decided to remove PostgreSQL's LDAP dependency in the first place!
Let's first try static linking with Alpine's standard PostgreSQL package:
$ docker run --volume $(pwd)/hello-postgres:/work \
bugfactory/postgresql-standard \
"gcc -static -o main main.c -lpq"
# (very long output with many error messages)
When you try this, you get countless undefined reference errors from the linker because it can't find symbols like the ones defined by libssl, for instance. You could, of course, manually specify more and more libraries on the command line: -lpq, -lssl, and so on. However, as I pointed out before, this is a lot of tedious work because you have to specify all indirect dependencies too. You have to be aware of your program's whole dependency tree.
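One rough way to get a starting list of flags is to derive it from the ldd output of the shared library. The following is a heuristic sketch and not something taken from the build described here: ldd_to_flags is a hypothetical helper that strips the lib prefix and the .so suffix from each entry and skips the C library. The result is only a list of candidates. Versioned names such as libldap_r-2.4.so.2 will not map to a usable flag, the order may need adjusting, and the matching static archives (.a files) still have to be installed.

```shell
# ldd_to_flags: read `ldd` output on stdin and print candidate -l flags
# on a single line. Hypothetical helper; the name mapping is a heuristic.
ldd_to_flags() {
  awk '$2 == "=>" && $1 !~ /^libc\./ {
    name = $1
    sub(/^lib/, "", name)       # drop the "lib" prefix
    sub(/\.so.*$/, "", name)    # drop ".so" and the version suffix
    printf "%s-l%s", sep, name
    sep = " "
  } END { print "" }'
}
```

For example, docker run bugfactory/postgresql-without-ldap "ldd /usr/lib/libpq.so" | ldd_to_flags should print -lssl -lcrypto, which still has to be combined with -lpq itself.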
Now, let's try static linking again with our custom PostgreSQL package:
$ docker run --volume $(pwd)/hello-postgres:/work \
bugfactory/postgresql-without-ldap \
"gcc -static -o main main.c -lpq -lssl -lcrypto"
$ ldd hello-postgres/main
statically linked
By eliminating the dependency on libldap (and all of its dependencies), we have been able to remove all but two indirect dependencies: libssl (-lssl) and libcrypto (-lcrypto).
Static linking is a nice approach if you need to make your programs more portable. However, for programs with complex dependencies, this can be difficult to achieve with standard packages. That said, the build scripts of package managers like apk can be patched to remove unneeded dependencies, which makes building statically linked programs more manageable.