Description
Following @ardentperf's proposal in #126 to reorganize the container images, I have begun analyzing the current workflows to assess their feasibility. The primary objective is to align with the long-term vision of enabling CloudNativePG to operate with minimalistic images containing only PostgreSQL.
The current workflow reflects "scar tissue" development—residual patterns from an earlier phase. Initially, we adapted workflows developed at EDB for the closed-source operator and reused them with minimal adjustments when we open-sourced the project. This included adopting Docker Hub’s official PostgreSQL images. Over time, we made incremental changes as needed but never paused to reassess the overall structure comprehensively. This proposal aims to use this opportunity to introduce fundamental changes to how we build container images. Below is my initial proposal, intended to spark constructive discussions.
Distributions
- Continue using Debian stable while maintaining oldstable.
Base Image
Transition from the official Docker Hub PostgreSQL image to images built on Debian Slim for Bookworm (12, stable) and Bullseye (11, oldstable). (Refer to Debian Releases.)
The primary reasons for this transition are:
- Official PostgreSQL images are designed to function outside Kubernetes as system containers, running as root (undesirable for us).
- They include an entry point we do not require.
Instead, we need the following:
- Debian Slim as the base
- PostgreSQL APT repositories for package management
- PostgreSQL itself
- Cleanup to minimize image size
- Non-root user (user ID 26, consistent with the existing CNPG default from RH packages)
- No entry point
We should include multi-lang packages as well, given the expected reduction in size.
Flavours
- Minimal: Contains only PostgreSQL. Maintained by the core maintainers.
- Standard: Builds on the minimal flavour, adding Barman Cloud (for as long as it is required) and extensions like pgaudit, pg_failover_slots, and pgvector. We should only use APT packages and stop building Barman from sources. Maintained by the core maintainers.
- Full: Includes the minimal and standard components plus the additional tools proposed by @ardentperf in #126 (proposal: minimal vs standard containers). Maintained by specific component owners (@ardentperf and ideally additional volunteers) under clear guidelines. This could also contain PostGIS.
Frequency
The original goal was a continuous delivery approach, rebuilding images only when underlying packages or the base image underwent significant changes. However, this intent was disrupted when we transitioned Barman Cloud from package-based installation to building from source using requirements.txt as a checksum. This led to unnecessary daily image regenerations.
Monthly or bi-weekly builds are more than enough, with on-demand builds as necessary (for example, in the case of a new release of PostgreSQL).
Container Image Sequence Number
Previously, we increased a sequence number whenever a container image of a specific PostgreSQL version contained changes (e.g., postgresql:16.6-28-bookworm, where 28 represents the 28th build for PostgreSQL 16.6 on Bookworm).
This approach stemmed from limitations in the earlier operator (pre-CloudNativePG), which lacked support for image digests like <image>:<tag>@sha256:<digestValue>. As this limitation no longer exists, the sequence number is now redundant.
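To make the shape of a digest-pinned reference concrete, here is a minimal sketch that splits a reference of the form <image>:<tag>@sha256:<digestValue> into its parts. The function name `split_reference` and the all-zero digest are placeholders for illustration only, not part of any existing tooling.

```python
def split_reference(ref):
    """Split '<image>:<tag>@<digest>' into (image, tag, digest).

    Tag and digest are returned as None when absent.
    """
    name_and_tag, _, digest = ref.partition("@")
    # A colon only marks a tag if it appears after the last slash;
    # otherwise it may be a registry port (e.g. localhost:5000/pg).
    last_segment = name_and_tag.rsplit("/", 1)[-1]
    if ":" in last_segment:
        image, tag = name_and_tag.rsplit(":", 1)
    else:
        image, tag = name_and_tag, None
    return image, tag, digest or None


# Placeholder digest (64 zeros), used only to show the format.
example = "postgresql:16.6-28-bookworm@sha256:" + "0" * 64
image, tag, digest = split_reference(example)
print(image, tag, digest)
```

Because the digest identifies the exact image content, the tag can stay human-readable without having to encode a build counter.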
Proposal: Replace Sequence Number with Timestamps
We can transition to using timestamps to denote builds, aligning better with modern image-management practices and avoiding unnecessary complexity. Precision could be minutes (or seconds) in the UTC timezone.
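As a rough sketch of what such a timestamp could look like, the snippet below formats a UTC time to minute (or optionally second) precision. The compact `YYYYMMDDHHMM` layout and the helper name `build_timestamp` are assumptions made for illustration; the proposal does not fix a format.

```python
from datetime import datetime, timezone


def build_timestamp(now=None, with_seconds=False):
    """Return a compact UTC timestamp usable as part of an image tag."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Convert to UTC so builds from different machines are comparable.
    now = now.astimezone(timezone.utc)
    fmt = "%Y%m%d%H%M%S" if with_seconds else "%Y%m%d%H%M"
    return now.strftime(fmt)


example = datetime(2024, 12, 5, 14, 30, 59, tzinfo=timezone.utc)
print(build_timestamp(example))                     # 202412051430
print(build_timestamp(example, with_seconds=True))  # 20241205143059
```

A fixed-width, most-significant-first layout like this one sorts lexicographically in build order, which keeps "latest build" lookups simple.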
Cognitive Load and Untestability of the Current Build Process
The current build process is overly complex and untestable locally. It relies on GitHub Actions to generate a convoluted matrix of combinations (which are actually predetermined) and to manage the sequence number. This complexity is unnecessary and can be replaced with a streamlined approach that can be run and tested locally.
We should aim for a single multi-stage Dockerfile that accepts PostgreSQL versions and Debian distributions as parameters. Additionally, we should explore open-source tools to assist with image building and related areas, such as generating SBOMs (Software Bill of Materials) and enriching OCI metadata, which have become more robust in recent years.
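Since the combinations are predetermined, they can be enumerated by a few lines of code that run anywhere, including on a laptop. The sketch below is one possible shape, under several assumptions: the version lists are illustrative, the build argument names `PG_VERSION` and `DEBIAN_RELEASE` are hypothetical, and each flavour is assumed to map to a stage name in the multi-stage Dockerfile.

```python
import shlex
from itertools import product

# Illustrative inputs only; the real lists would live in the repository.
POSTGRES_VERSIONS = ["16.6", "17.2"]
DEBIAN_RELEASES = ["bookworm", "bullseye"]
FLAVOURS = ["minimal", "standard", "full"]


def build_matrix(pg_versions, debian_releases, flavours):
    """Enumerate every (PostgreSQL version, Debian release, flavour) build."""
    return [
        {"pg_version": pg, "debian": deb, "flavour": flavour}
        for pg, deb, flavour in product(pg_versions, debian_releases, flavours)
    ]


def docker_build_command(entry, context="."):
    """Build the argv for one entry (hypothetical build-arg names)."""
    return [
        "docker", "build",
        "--target", entry["flavour"],  # assumes one stage per flavour
        "--build-arg", f"PG_VERSION={entry['pg_version']}",
        "--build-arg", f"DEBIAN_RELEASE={entry['debian']}",
        context,
    ]


matrix = build_matrix(POSTGRES_VERSIONS, DEBIAN_RELEASES, FLAVOURS)
for entry in matrix:
    print(shlex.join(docker_build_command(entry)))
```

The same list could feed a CI job matrix, so that the local run and the pipeline share a single source of truth.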
OCI Metadata
Currently, our images only include basic LABELs. We should adopt OCI annotations with the org.opencontainers.image prefix to better align with industry standards.
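As an illustration, a small helper could assemble a set of these annotations per build. The keys used below (`created`, `version`, `revision`, `title`, `base.name`) are among those predefined by the OCI image specification; the helper name `oci_annotations`, the chosen subset, and all values are assumptions for the sake of the example.

```python
from datetime import datetime, timezone

PREFIX = "org.opencontainers.image"


def oci_annotations(pg_version, debian, created, revision):
    """Return a dict of OCI annotations for one image build."""
    created_utc = created.astimezone(timezone.utc)
    return {
        # The spec expects an RFC 3339 date-time for 'created'.
        f"{PREFIX}.created": created_utc.strftime("%Y-%m-%dT%H:%M:%SZ"),
        f"{PREFIX}.version": pg_version,
        f"{PREFIX}.revision": revision,
        f"{PREFIX}.title": f"PostgreSQL {pg_version} on Debian {debian}",
        f"{PREFIX}.base.name": f"docker.io/library/debian:{debian}-slim",
    }


annotations = oci_annotations(
    pg_version="17.2",
    debian="bookworm",
    created=datetime(2024, 12, 5, 14, 30, tzinfo=timezone.utc),
    revision="0123abc",  # placeholder commit id
)
for key, value in annotations.items():
    print(f"{key}={value}")
```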
SBOM
This is an excellent opportunity to introduce SBOMs into the build process, ensuring transparency and compliance with modern security practices.
Image Naming Schema
Our current naming schema is: postgresql:<POSTGRES_VERSION>-<SEQUENCE>-<DEBIAN_VERSION_NAME>.
We manage the following aliases:
- latest: Points to the highest sequence number for the latest patch of the highest PostgreSQL major version (currently 17) on bullseye.
- MAJOR_VERSION (e.g., 17): Points to the highest sequence number for the latest patch of a specific PostgreSQL major version on bullseye.
- POSTGRES_VERSION (e.g., 17.2): Points to the highest sequence number for a specific PostgreSQL minor version on bullseye.
- MAJOR_VERSION-DEBIAN_VERSION_NAME (e.g., 17-bullseye): Points to the latest patch for a specific PostgreSQL major version on a given Debian version.
- POSTGRES_VERSION-DEBIAN_VERSION_NAME (e.g., 17.2-bullseye): Points to the latest patch for a specific PostgreSQL minor version on a given Debian version.
In practice, only the last two aliases are meaningful. The first three stem from "scar tissue" approaches dating back to when we had a single Debian version and used these containers to promote PostgreSQL in Kubernetes. These approaches are no longer necessary.
Proposed Schema
We propose a new naming schema:
postgresql:<POSTGRES_VERSION>-<FLAVOUR>-<DEBIAN_VERSION_NAME>-<TIMESTAMP_TO_MINUTE_IN_UTC>.
The <FLAVOUR> field could be one of:
- minimal
- standard
- full
Under this schema, we eliminate aliases that omit critical details like flavour and Debian version, leaving aliases such as:
- 17-minimal-bookworm: The latest minimal image on Bookworm with the most recent PostgreSQL 17 patch.
- 17.2-minimal-bullseye: The latest minimal image on Bullseye for PostgreSQL 17.2.
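The proposed schema and its two remaining aliases can be sketched as a small function. The name `image_tags`, the dictionary keys, and the sample timestamp are hypothetical; only the tag layout follows the schema described above.

```python
FLAVOURS = ("minimal", "standard", "full")


def image_tags(pg_version, flavour, debian, timestamp):
    """Return the immutable tag plus the two rolling aliases for a build.

    pg_version is expected as 'MAJOR.MINOR', e.g. '17.2'.
    """
    if flavour not in FLAVOURS:
        raise ValueError(f"unknown flavour: {flavour!r}")
    major = pg_version.split(".")[0]
    return {
        # Unique per build, never moved once pushed.
        "immutable": f"{pg_version}-{flavour}-{debian}-{timestamp}",
        # Moves to the newest build of this exact PostgreSQL version.
        "minor_alias": f"{pg_version}-{flavour}-{debian}",
        # Moves to the newest build of the newest patch of this major.
        "major_alias": f"{major}-{flavour}-{debian}",
    }


tags = image_tags("17.2", "minimal", "bookworm", "202412051430")
for kind, tag in tags.items():
    print(f"{kind}: postgresql:{tag}")
```

Every tag produced this way names both the flavour and the Debian release, so none of them is ambiguous in the way the current latest alias is.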
Image Catalogs
Currently, image catalogs for PostgreSQL containers are built and stored in the Git repository. Given the direction to separate operands from extensions, it makes sense to suspend changes to the catalogs for now. Eventually, these catalogs should be moved to a separate repository to streamline development and maintenance.
Gradual Deprecation of Existing Images
To ensure a seamless transition for end users, we should implement a gradual deprecation strategy for the current operand images. This will minimise disruption while encouraging the adoption of the new image schema.
The most noticeable change will be the removal of the latest alias, along with all other aliases that do not explicitly include:
- The name of the Debian distribution in the first stage.
- The flavour of the image.
Smoke Tests
We should incorporate smoke tests for each image built, ensuring they are tested with the latest stable version of CloudNativePG.
Summary
This document is likely not exhaustive, but it aims to provide a solid foundation for further discussions and planning.
Suggested Next Steps
- Repository Structure: Decide whether to:
  - Create a new branch within this repository, or
  - Establish a new repository entirely (likely the better option, though naming it might prove challenging).
- Testing and Artifact Management: Begin testing the proposed changes and push resulting artifacts to the postgresql_testing repository for review and validation.
These steps should guide us toward a more efficient and flexible container image build process. Let’s continue iterating on this as a community!