Kamaji control plane with K0s workers #476

florent1s started this conversation in General
Jun 18, 2024 · 3 comments · 7 replies

Hello there! I wanted to share my findings from the last months of working with Kubernetes and Kamaji, specifically regarding the combination of a Kamaji control plane with K0s workers.
TL;DR: It is possible and doesn't actually require that much tinkering (it took a lot of experimentation to get there, however ;) ).
I don't know if anyone has tried this already, as I couldn't find much information (if any) about it online, so for those interested, here are the steps to get this working. Feel free to share any questions, thoughts, ideas or suggestions this inspires. Also, thanks @prometherion for your help on the issue I posted; the insight you shared proved quite helpful during my experimentation.

What you need

For reference, here are the versions I used:

  • K0s: 1.29.2+k0s.0
  • Kamaji: v0.4.2
  • Host cluster is on standard k8s running 1.28.3

Configuration files

You also need a few extra files to configure the tenant cluster and its control plane:

tenant control plane definition file

(and namespace definition if needed)
Remember to set spec.networkProfile.address to a valid IP.
You might also want to adjust spec.kubernetes.version to match the version your k0s workers are running (I don't know what impact mismatching those versions could have).
Notes:

  • The ports in spec.networkProfile and spec.addons.konnectivity.server need to be in the NodePort range (30000-32767). Don't make the same mistake I did ;)
  • The address in spec.networkProfile can be the IP of a worker node of the host cluster (what I used for testing, not robust but quick and easy), a virtual IP grouping those worker nodes, or the IP of a load balancer which points to those nodes (an ingress's IP is also an option, I am currently working on such a setup).

tenant-00.yaml

apiVersion: v1
kind: Namespace
metadata:
  labels:
    kubernetes.io/metadata.name: test
  name: test

---

apiVersion: kamaji.clastix.io/v1alpha1
kind: TenantControlPlane
metadata:
  name: tenant-00
  namespace: test
  labels:
    tenant.clastix.io: tenant-00
spec:
  dataStore: default
  controlPlane:
    deployment:
      replicas: 3
      additionalMetadata:
        labels:
          tenant.clastix.io: tenant-00
      extraArgs:
        apiServer: []
        controllerManager: []
        scheduler: []
      resources:
        apiServer:
          requests:
            cpu: 250m
            memory: 512Mi
          limits: {}
        controllerManager:
          requests:
            cpu: 125m
            memory: 256Mi
          limits: {}
        scheduler:
          requests:
            cpu: 125m
            memory: 256Mi
          limits: {}
    service:
      additionalMetadata:
        labels:
          tenant.clastix.io: tenant-00
      serviceType: NodePort
  kubernetes:
    version: v1.29.1
    kubelet:
      cgroupfs: systemd
      preferredAddressTypes:
      # default is Hostname first but that requires you to tell konnectivity daemonset 
      # how to resolve worker hostname (hostAlias for ex.) as CoreDNS doesn't know it
      - InternalIP  
      - Hostname
      - ExternalIP
    admissionControllers:
      - ResourceQuota
      - LimitRanger
  networkProfile:
    address: <worker_ip>
    port: 30001
    serviceCidr: 10.96.0.0/16
    podCidr: 10.36.0.0/16
    dnsServiceIPs:
    - 10.96.0.10
  addons:
    coreDNS: {}
    kubeProxy: {}
    konnectivity:
      server:
        port: 30002
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits: {}

default worker configuration configmap

Copied from a standard k0s cluster; I only changed data.apiServerAddresses to match my configuration.

configmap-k0s-worker-default.yaml

apiVersion: v1
data:
  apiServerAddresses: '["<worker_ip>:30001"]'
  konnectivity: '{"enabled":true,"agentPort":8132}'
  kubeletConfiguration: '{"kind":"KubeletConfiguration","apiVersion":"kubelet.config.k8s.io/v1beta1","syncFrequency":"0s","fileCheckFrequency":"0s","httpCheckFrequency":"0s","tlsCipherSuites":["TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256","TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384","TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256","TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256","TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384","TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256"],"tlsMinVersion":"VersionTLS12","rotateCertificates":true,"serverTLSBootstrap":false,"authentication":{"x509":{},"webhook":{"cacheTTL":"0s"},"anonymous":{}},"authorization":{"webhook":{"cacheAuthorizedTTL":"0s","cacheUnauthorizedTTL":"0s"}},"eventRecordQPS":0,"clusterDomain":"cluster.local","clusterDNS":["10.96.0.10"],"streamingConnectionIdleTimeout":"0s","nodeStatusUpdateFrequency":"0s","nodeStatusReportFrequency":"0s","imageMinimumGCAge":"0s","imageMaximumGCAge":"0s","volumeStatsAggPeriod":"0s","cgroupsPerQOS":true,"cpuManagerReconcilePeriod":"0s","runtimeRequestTimeout":"0s","evictionPressureTransitionPeriod":"0s","failSwapOn":false,"memorySwap":{},"logging":{"flushFrequency":0,"verbosity":0,"options":{"json":{"infoBufferSize":"0"}}},"shutdownGracePeriod":"0s","shutdownGracePeriodCriticalPods":"0s","containerRuntimeEndpoint":""}'
  nodeLocalLoadBalancing: '{"type":"EnvoyProxy","envoyProxy":{"image":{"image":"quay.io/k0sproject/envoy-distroless","version":"v1.29.0"},"imagePullPolicy":"IfNotPresent","apiServerBindPort":7443,"konnectivityServerBindPort":7132}}'
  pauseImage: '{"image":"registry.k8s.io/pause","version":"3.9"}'
kind: ConfigMap
metadata:
  annotations:
    k0s.k0sproject.io/stack-checksum: f40b7da8b1b63621788bf8c3c2960f8f
  labels:
    app.kubernetes.io/component: worker-config
    app.kubernetes.io/managed-by: k0s
    app.kubernetes.io/name: k0s
    app.kubernetes.io/version: v1.29.2-k0s.0
    k0s.k0sproject.io/stack: k0s-worker-config
    k0s.k0sproject.io/worker-profile: default
  name: worker-config-default-1.29
  namespace: kube-system

Role and rolebinding

To allow the tenant workers to access the previously defined configmap:

rb-configmap-access.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: system:bootstrappers:worker-config
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: system:bootstrappers:worker-config
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:nodes

---

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: system:bootstrappers:worker-config
  namespace: kube-system
rules:
- apiGroups:
  - ""
  resourceNames:
  - worker-config-default-1.29
#  - worker-config-default-windows-1.29
  resources:
  - configmaps
  verbs:
  - get
  - list
  - watch

How to install

Here is the script I wrote to create a basic TCP, apply the aforementioned configuration files and create the token for the k0s workers to join the cluster:
Remember to set SRVR_ADDR to a valid IP

setup-tenant.sh

# ===== DEFINE PARAMETERS =====

# API server location
SRVR_ADDR=<worker_ip>
SRVR_PORT=30001

# TCP namespace
NS=test

CLUSTER_NAME=tenant-00


# ===== SETUP TENANT CONTROLPLANE & CLUSTER =====

# setup TCP and nodeports
kubectl apply -f ${CLUSTER_NAME}.yaml

# wait for TCP to become ready
kubectl wait --timeout 60s --for=jsonpath='{.status.kubernetesResources.version.status}'=Ready tcp/${CLUSTER_NAME} -n $NS

# get kubeconfig file of tenant cluster
kubectl get secrets -n $NS ${CLUSTER_NAME}-admin-kubeconfig -o json \
  | jq -r '.data["admin.conf"]' \
  | base64 --decode \
  > ${CLUSTER_NAME}.kubeconfig

# wait for tenant cluster to be ready to receive requests
sleep 10

# add necessary resources to tenant cluster
kubectl apply --kubeconfig ${CLUSTER_NAME}.kubeconfig -f rb-configmap-access.yaml,configmap-k0s-worker-default.yaml

# install Cilium CNI
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --version 1.15.5 \
  --namespace kube-system \
  --set securityContext.privileged=true \
  --kubeconfig ${CLUSTER_NAME}.kubeconfig


# ===== CREATE K0S TOKEN =====

# get the base64 encoded CA data from the kubeconfig (avoids relying on an exact line number)
CA=$(grep 'certificate-authority-data:' ${CLUSTER_NAME}.kubeconfig | awk '{print $2}')

# create a kubeadm join token and extract the 23-character token value from the join command
TOKEN_STR=$(kubeadm token --kubeconfig=${CLUSTER_NAME}.kubeconfig create --print-join-command)
TOKEN=$(echo ${TOKEN_STR#*--token} | cut -c -23)

# put all that into a specific kubeconfig file and compress it with gzip before encoding in base64
# most elements need to have this specific value or else the worker won't start
cat <<EOF | gzip | base64 > k0s-token
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: $CA
    server: https://$SRVR_ADDR:$SRVR_PORT
  name: k0s
contexts:
- context:
    cluster: k0s
    user: kubelet-bootstrap
  name: k0s
current-context: k0s
kind: Config
preferences: {}
users:
- name: kubelet-bootstrap
  user:
    token: $TOKEN
EOF

Join the worker

  • Copy the k0s-token file to the workers
  • On said workers, run k0s install worker --token-file k0s-token to set up the k0s worker service, then start it so the node joins the cluster (see the sketch below)
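
For reference, a minimal sketch of the join-and-verify flow (the token file name and cluster name are the ones used above):

# on each worker, after copying the token file over
sudo k0s install worker --token-file k0s-token
sudo k0s start

# back on the machine holding the tenant kubeconfig, the new node should show up
kubectl get nodes --kubeconfig tenant-00.kubeconfig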

Going further

I am still working on assembling a similar setup but with an ingress positioned in front of the host cluster, making all communications go through it. I found ways to make it work but I'm trying to eliminate some of the complexity; I will post an update once that's done (or when I'm stuck and don't know what to try anymore ;) ). I also plan on running the CNCF conformance tests against this setup to validate the cluster's behavior.

I also have a question regarding the configuration options of the CoreDNS addon:

  • How/where can we configure it? According to the API reference, only imageRepository and imageTag can be set in the TCP definition, but it would be nice to be able to customize the Corefile to enable the log/debug plugins or inject some host entries. I tried editing the coredns configmap but it gets reset instantly, so I suppose Kamaji is controlling it.
    Turns out there is a recent discussion about this very thing: Enhancing CoreDNS configuration customisation #475

Replies: 3 comments · 7 replies


Holy smoke! 🤯

Thanks for the advanced investigation you put in there, @florent1s!

Wondering what the next steps for this could be: maybe adding it to the documentation on the Kamaji website?

1 reply
@florent1s

Thanks!
I think it would be useful to have it in the documentation, or at least mention the option with a link to this discussion to make it easier to find for others. If I can help to write that documentation I'd be glad to do so. For now, I will keep updating this discussion with the progress I make regarding the more advanced setups.


Thanks @florent1s! I agree with @prometherion that this could be integrated into the documentation, it's very clear. I'm wondering if the k0sctl config could be leveraged to automate some parts further.

I know this can already be achieved with k0smotron, and it's integrated with k0s. May I ask which points influenced your architectural decision to use Kamaji for managing control planes, @florent1s?

1 reply
@florent1s

As far as I know you cannot run k0sctl with a config file that doesn't contain a controller node, so that makes it harder to leverage for this specific use case. But we plan on automating the setup process by integrating it into our GitOps pipeline once I get the desired setup with ingress working; we'll see how that goes.

We chose to try this with Kamaji because we don't want to be locked in with a single vendor, and also my internship supervisor already had some experience with Kamaji.


The challenge and solution presented here are not exclusive to K0s and therefore also apply to standard K8s workers, so this guide may also help people who don't plan on using K0s.

Kamaji CP with K0s workers part 2: using ingress and FQDN to differentiate tenants

Hello! Here I am again with the follow-up to my previous post. It took me a while to write this because I had other priorities (internship report ;) ).

We finally have K0s workers connecting to a Kamaji control plane over an ingress (or rather, multiple ingresses), with each tenant identifiable by its FQDN, which removes the need to assign a static IP per tenant.
The configuration required to get this to work is a bit more complicated than the one for the standard setup I presented a few weeks ago, so I will take the time to explain the setup and the challenges that come with it.
As previously stated, feel free to share any ideas that come to mind; if you see alternative (maybe even better) ways to solve the challenges I present here, I would gladly discuss them further.

Goals, challenges and experimentation

In the standard setup I described in my previous post, you had to define an IP from the host cluster (the IP of one of its worker nodes, for example) which would be the destination for all packets sent to the API server.
While this works well enough, it requires you to use a different pair of ports for every tenant or to have a unique IP per tenant (a virtual IP pointing to the tenant's control plane pods, or load balancers listening on the various IPs).
That makes this solution impractical, as you either have to open two new ports for every tenant you create or have a lot of public IPs available.

A possible solution is to have an ingress set up in front of the Kamaji cluster that redirects requests to the right API server depending on the FQDN presented to it.
By configuring a different FQDN for every tenant you can theoretically have an unlimited number of tenants while only using a single IP and two ports, one for the API server and one for the Konnectivity server (we will actually need a third port, but more on that later).

This sounds quite simple so far, right? So where is the catch?
Well, problems arise once you discover that a K8s service can't point to an FQDN, which means that the default Kubernetes service in charge of redirecting traffic towards the API server requires an IP, which is precisely what we wanted to avoid.

Our approach to solving this was to deploy a proxy as a daemonset on the tenant workers, which would listen on the destination IP configured in the default service and redirect those packets to our ingress while using the desired FQDN as SNI in the requests.
This is the part that turned out to be quite the head-scratcher, as I quickly realized that configuring both the proxy and the ingress in a way that would make this work was not going to be that simple.
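
To illustrate the idea, here is a rough sketch of such a daemonset (not my exact manifest: the namespace, image tag and the configmap holding the actual proxy configuration are placeholders):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: apiserver-proxy
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: apiserver-proxy
  template:
    metadata:
      labels:
        app: apiserver-proxy
    spec:
      # run on the host network so the traffic redirected by the default
      # kubernetes service can reach the proxy on the node itself
      hostNetwork: true
      containers:
      - name: haproxy
        image: haproxy:2.9
        volumeMounts:
        - name: config
          mountPath: /usr/local/etc/haproxy
      volumes:
      - name: config
        configMap:
          # placeholder: contains the haproxy.cfg that wraps the incoming
          # traffic in a new TLS session towards the ingress, using the
          # tenant FQDN as SNI
          name: haproxy-tenant-proxy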

The problem is that the packets sent from the default service and redirected by the proxy are already using HTTPS, and you can't just modify the SNI of such a packet as it is part of the TLS handshake.
To solve this you can give a certificate to the proxy, which allows it to terminate the TLS session of the arriving traffic before creating a new one with the destination ingress on the host cluster (I tested this with a copy of the tenant cluster certificate, which can be found in a secret on the host cluster).
While this works, it is not what we wanted to do, as it requires you to actively maintain the proxy to make sure the certificate is always valid.
We therefore tried to configure the proxy to do TLS passthrough, which worked fine when the incoming requests were plain HTTP but not with the HTTPS traffic coming from the cluster's components.
I tried a lot of different combinations of ingress and proxy configurations; the only one worth mentioning, because it actually could work once configured correctly, consists of setting up the proxy and ingress to use the proxy protocol to transmit some information about the client through the proxy and use that to differentiate tenants on the ingress.
But I didn't dig too deep into this one, as we found another, probably simpler, way.

The proxy I used for my experiments is HAProxy, and the reason it was so difficult to solve this issue was that I didn't understand how HAProxy behaves when you configure it for TLS passthrough while still telling it to change the SNI of the forwarded requests.
When configured to do TLS passthrough while still setting a different SNI for the forwarded requests, HAProxy opens a new TLS session with the ingress using the configured FQDN as SNI, so we end up with TLS over TLS communications (see figure below).
Once we identified this behavior, we just needed to configure our ingress to terminate the TLS connection with the proxy (the first TLS layer) before forwarding the content as-is (plain TCP) to the API server, which is able to deconstruct the second TLS layer.
It turns out that Nginx, which is the ingress we used up to that point, is unable to do this, so we had to find an alternative.
We decided to give the Kubernetes Gateway API a go because it allows you to define precise instructions on how to handle incoming traffic and how to forward it. We used it in combination with Traefik, as it supports many of the Gateway API's experimental features which we needed.
The last small problem we faced is that all the other communications that need to reach the API server are regular HTTPS and not TLS over TLS; this is the case for the CLI tools (kubectl, K9s), the CNI (Cilium in our case) and the kubelet itself.
The quick and easy fix was to add a third ingress dedicated to this traffic, acting as a standard TLS passthrough with termination on the API server. This configuration ended up working pretty well, and once everything is figured out it isn't too complicated to set up either, so we will probably stick with it.

[figure: tls-over-tls]
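
To give an idea of what the TLS-terminating piece looks like with the Gateway API, here is a sketch (not my exact configuration: the names, namespace, certificate secret and backend service/port are placeholders, and TCPRoute requires the experimental channel of the Gateway API CRDs):

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: tenant-gateway
  namespace: test
spec:
  gatewayClassName: traefik
  listeners:
  - name: tenant-00-api
    protocol: TLS
    port: 443
    hostname: tenant-00.example.com
    tls:
      # strip the outer TLS layer added by the proxy
      mode: Terminate
      certificateRefs:
      - name: tenant-00-ingress-cert
    allowedRoutes:
      kinds:
      - kind: TCPRoute

---

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TCPRoute
metadata:
  name: tenant-00-api
  namespace: test
spec:
  parentRefs:
  - name: tenant-gateway
    sectionName: tenant-00-api
  rules:
  # forward the decrypted stream (still HTTPS from the original client) as
  # plain TCP to the tenant API server, which terminates the inner TLS layer
  - backendRefs:
    - name: tenant-00
      port: 6443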

The working setup

The working setup we ended up with works as follows:

  1. Pods (or other K8s entities) send traffic to the default service (SNI = default service IP)
  2. This service forwards the traffic to the proxy running on the host network
  3. The proxy initiates a new TLS session with the ingress on the host cluster, thereby encapsulating the received traffic (SNI = tenant FQDN)
  4. The ingress (a Gateway from the K8s Gateway API with Traefik) receives the packets from the proxy and terminates the TLS session, removing the first encryption layer before forwarding to the tenant CP pods (API server)
  5. The API server receives the HTTPS traffic in the same state as it was sent by the original client and can terminate the second TLS layer to read the actual content.

On top of that we also have two other ingresses (good old Nginx, but this could be done with Gateways for a more homogeneous setup), one for the communication between Konnectivity servers and agents and another for the CLI tools, CNI and kubelet. Those simply forward all matching traffic to their corresponding server in TCP mode (TLS passthrough).
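
As an example, one of those passthrough ingresses could look roughly like this (a sketch, assuming ingress-nginx is running with --enable-ssl-passthrough; the host, service name and port are placeholders):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tenant-00-konnectivity
  namespace: test
  annotations:
    # forward the TLS stream untouched; termination happens at the konnectivity server
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
spec:
  ingressClassName: nginx
  rules:
  - host: konnectivity.tenant-00.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: tenant-00
            port:
              number: 8132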

[figure: kamaji-cluster]

Step by step configuration

I will not share the entire configuration here, as this post would be way too long, but if someone is interested I could write a step-by-step guide with all the relevant configurations, just like for the standard setup in my previous post.

What's next

The next step on my side for this will be to make helm charts to deploy such a setup more easily before integrating those in our GitOps pipeline.

5 replies
@prometherion

Too bad I missed this reply, so replying late.

It's the same concept we're using for our Enterprise Addon, just a small note on our implementation.

  1. CNI agnostic, tested with Cilium and Calico
  2. (eventually) multiple Ingress support (right now, HAProxy is supported)
  3. No need to open multiple ports on a per-Tenant basis: everything is forwarded on the TLS port (443, generally speaking), relying only on SNI routing, both for internal resolution (worker nodes), CLI, and Konnectivity.
  4. Already GitOps compliant: it's just a matter of annotations on the Tenant Control Plane resources
@joseluisgonzalezca

What's the reason why only HAProxy can be used as the Ingress controller? Is there any limitation with annotations or ssl-passthrough?

@prometherion

Our customers are relying on HAProxy; we can develop the flavour for other ingress controllers!

@joseluisgonzalezca

Thanks for the quick reply! I was curious about the possible implications of using other controllers. I'm happy to see that other options can be considered in the future :)

@modzilla99

Hey, great work you did there, @joseluisgonzalezca. Just want to chime in and say that support for the Gateway API would be phenomenal. It's much more flexible than Ingress and will definitely help with a vendor-neutral implementation without any controller-specific annotations. Just for context, the Gateway API spec supports TLS passthrough (SNI-based routing) and TCP load balancing natively. Something like the Envoy Gateway controller is a good reference!
