Hi,

I'm posting here before opening an issue because I'm not sure whether this is a bug or a misuse of the operator SDK.

I'm using the Java Operator SDK with the Quarkus extension (Operator SDK 5.1.1 / Quarkus Java Operator SDK extension 7.2.0), the latest versions at the time of writing.

I'm writing my first operator. This is my setup (on OKD), inspired by the mysql-schema sample: https://github.com/operator-framework/java-operator-sdk/tree/main/sample-operators/mysql-schema

Basically, it's a CR that creates a service account in LDAP and saves the secret in Kubernetes:

  • One CR
  • One Secret dependent
  • One external dependent that creates the account and observes it (in case it's deleted)

My issue occurs only when my controller restarts (with the native container image; I cannot reproduce it with quarkus dev, not sure why). The operator tries to recreate the secret even though it already exists:

 - ldap-secret -> io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://172.30.0.1:443/api/v1/namespaces/*****/secrets. Message: secrets "ldap-sa-****" already exists. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=secrets, name=ldap-sa***, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=secrets "ldap-****" already exists, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=AlreadyExists, status=Failure, additionalProperties={}).

Code snippet

@ControllerConfiguration(
        maxReconciliationInterval =
                @MaxReconciliationInterval(interval = 1, timeUnit = TimeUnit.HOURS),
        finalizerName = LDAPServiceAccount.FINALIZER)
@Workflow(
        dependents = {
            @Dependent(type = SecretDependent.class, name = "ldap-secret"),
            @Dependent(
                    type = LDAPServiceAccountResourceDependent.class,
                    name = "service-account",
                    dependsOn = "ldap-secret"),
        })
@RBACRule(apiGroups = LDAPServiceAccount.GROUP, resources = LDAPServiceAccount.PLURAL, verbs = RBACRule.ALL)
@RateLimited(maxReconciliations = 1, within = 30, unit = TimeUnit.SECONDS)
public class LDAPServiceAccountReconcilier extends AbstractConditionsReconcilier<LDAPServiceAccount, LDAPServiceAccountStatus> {

    @Override
    public UpdateControl<LDAPServiceAccount> reconcile(
            LDAPServiceAccount resource, Context<LDAPServiceAccount> context) {

        LDAPServiceAccountStatus status = resource.getStatus();
        ObjectMeta metadata = resource.getMetadata();
        LDAPServiceAccountSpec spec = resource.getSpec();

        StatusBuilder statusBuilder = new StatusBuilder();
        
        // shouldSkip skip if the status is done for the given generation
        if (shouldSkip(resource, status) || context.getSecondaryResource(LDAPServiceAccountResource.class).isEmpty()) {
            return UpdateControl.noUpdate();
        }

        statusBuilder.addDone(resource, "Reconciled");
        resource.setStatus(build(resource, statusBuilder));
        return UpdateControl.patchStatus(createForStatusUpdate(resource));
    }

    @Override
    protected LDAPServiceAccount createForStatusUpdate(LDAPServiceAccount resource) {
        LDAPServiceAccount copy = new LDAPServiceAccount();
        copy.setMetadata(new ObjectMetaBuilder()
                .withName(resource.getMetadata().getName())
                .withNamespace(resource.getMetadata().getNamespace())
                .build());
        copy.setStatus(resource.getStatus());
        return copy;
    }
}

@KubernetesDependent
public class SecretDependent extends KubernetesDependentResource<Secret, LDAPServiceAccount>
        implements Creator<Secret, LDAPServiceAccount>, GarbageCollected<LDAPServiceAccount> {

    ....

    public SecretDependent() {
        super(Secret.class);
    }

    @Override
    protected Secret desired(LDAPServiceAccount primary, Context<LDAPServiceAccount> context) {
        String groupName = primary.getSpec().group();
        String secretName = serviceAccountSecretName(primary);
        String serviceAccountName = ldapService.getServiceAccountName(LDAPService.GroupResponse.of(groupName));
        String password = ldapService.generatePassword();
        LOG.info(
                "Desired Secret with new random password '{}' for LDAP Service Account '{}'",
                secretName,
                serviceAccountName);
        LDAPService.ServiceAccountResponse serviceAccount =
                LDAPService.ServiceAccountResponse.of(serviceAccountName, LDAPService.DESCRIPTION);
        return kubernetesService.buildSecret(secretName, serviceAccount.email(), password, secretLabels(primary));
    }

    @Override
    public Matcher.Result<Secret> match(
            Secret actual, LDAPServiceAccount primary, Context<LDAPServiceAccount> context) {
        String secretName = serviceAccountSecretName(primary);
        LOG.info(
                "Matching Secret '{}' to LDAP Service Account '{}' in namespace '{}'",
                actual.getMetadata().getName(),
                primary.getMetadata().getName(),
                primary.getMetadata().getNamespace());
        // The Secret matches when both its name and its namespace are the expected ones.
        boolean matches = actual.getMetadata().getName().equals(secretName)
                && actual.getMetadata().getNamespace().equals(primary.getMetadata().getNamespace());
        return Matcher.Result.nonComputed(matches);
    }

    private Map<String, String> secretLabels(LDAPServiceAccount resource) {
        return Map.of(
                "app.kubernetes.io/managed-by", "on-prem-operator",
                "app.kubernetes.io/instance", resource.getMetadata().getName(),
                "app.kubernetes.io/component", "ldap-service-account");
    }

    private String serviceAccountSecretName(LDAPServiceAccount resource) {
        return "ldap-sa-%s".formatted(resource.getMetadata().getName());
    }
}

I'm also adding part of the external dependent, but I don't think it's the cause, since it depends on the secret and is able to observe the external resource without issue, and its desired state matches its observed state.

public class LDAPServiceAccountResourceDependent
        extends PerResourcePollingDependentResource<LDAPServiceAccountResource, LDAPServiceAccount>
        implements Creator<LDAPServiceAccountResource, LDAPServiceAccount>, Deleter<LDAPServiceAccount> {

    public LDAPServiceAccountResourceDependent() {
        super(LDAPServiceAccountResource.class, Duration.ofHours(1));
    }

    @Override
    protected LDAPServiceAccountResource desired(LDAPServiceAccount primary, Context<LDAPServiceAccount> context) {
        String serviceAccountName = ldapService.getServiceAccountName(
                LDAPService.GroupResponse.of(primary.getSpec().group()));
        return new LDAPServiceAccountResource(serviceAccountName, LDAPService.DESCRIPTION);
    }

    @Override
    public void delete(LDAPServiceAccount primary, Context<LDAPServiceAccount> context) {
        // Not relevant
    }

    @Override
    public LDAPServiceAccountResource create(
            LDAPServiceAccountResource resource, LDAPServiceAccount primary, Context<LDAPServiceAccount> context) {
        // Not relevant
    }

    @Override
    public Set<LDAPServiceAccountResource> fetchResources(LDAPServiceAccount primary) {
        // Not relevant
    }
}

Before getting the 409 conflict, I was able to capture some debug logs:

2025-06-14 16:09:15,912 INFO  [ch.elc.ope.onp.con.lda.ser.dep.SecretDependent] (pool-23-thread-12) Desired Secret with new random password 'ldap-*****' for LDAP Service Account '****'
...
2025-06-14 16:09:15,912 DEBUG [io.jav.ope.pro.dep.kub.KubernetesDependentResource] (pool-23-thread-12) Creating target resource with type: class io.fabric8.kubernetes.api.model.Secret, with id: ResourceID{name='ldap-sa-***', namespace='****'}
...
2025-06-14 16:09:15,984 ERROR [ch.elc.ope.onp.con.AbstractConditionsReconcilier] (ReconcilerExecutor-ldapserviceaccountreconcilier-115) Error reconciling resource*****: io.javaoperatorsdk.operator.OperatorException: io.javaoperatorsdk.operator.AggregatedOperatorException: Exception(s) during workflow execution. Details:
 - ldap-secret -> io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://172.30.0.1:443/api/v1/namespaces/****/secrets. Message: secrets "ldap-*****" already exists. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=secrets, name=ldap-sa-****, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=secrets "ldap-sa-****" already exists, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=AlreadyExists, status=Failure, additionalProperties={}).
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:507)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:524)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleCreate(OperationSupport.java:340)

I never see my "Matching Secret ..." logs, so I assume the desired state doesn't match and the secret is created.

I observe exactly the same behavior for a Secret dependent that doesn't change (its desired state is always the same).

I cannot reproduce this with 'quarkus dev', so I'm wondering whether it's some kind of race condition when the operator starts on the Quarkus native image. After a few seconds, all the resources become reconciled, but I always get this exception when the operator restarts, putting all CRs in error when it should just skip the reconciliation, since the state didn't change between restarts.

Or do you see anything wrong with my implementation?

Thanks for the help!

Replies: 3 comments · 4 replies

Looking at the code, it looks like the actual resource is null. This is suspicious to me.

This happens because the informer does not see the secret in its cache, but I don't see anything suspicious in your code. Maybe make sure that both the controller and the informer of the KubernetesDependent watch all the namespaces.
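
The mechanism can be illustrated with a small plain-Java simulation. This is only a sketch under assumptions: `FakeApiServer`, `FakeCache`, and `AlreadyExistsException` are hypothetical stand-ins, not JOSDK or fabric8 classes, and the real informer logic is more involved. It shows that a create-only step deciding from a cache that has not caught up yet issues a create and gets a conflict, while the same step after a sync finds the resource and does nothing.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Plain-Java simulation. Every type here is a hypothetical stand-in,
// not a JOSDK or fabric8 class.
final class StaleCacheDemo {

    /** Stand-in for the 409 AlreadyExists failure returned by the API server. */
    static final class AlreadyExistsException extends RuntimeException {
        AlreadyExistsException(String name) {
            super("secrets \"" + name + "\" already exists");
        }
    }

    /** Stand-in for the API server, the source of truth. */
    static final class FakeApiServer {
        private final Map<String, String> secrets = new HashMap<>();

        void create(String name, String data) {
            // putIfAbsent returns the previous value when the key is already taken.
            if (secrets.putIfAbsent(name, data) != null) {
                throw new AlreadyExistsException(name);
            }
        }

        Map<String, String> list() {
            return Map.copyOf(secrets);
        }
    }

    /** Stand-in for an informer cache: a local copy that can lag behind the server. */
    static final class FakeCache {
        private final Map<String, String> known = new HashMap<>();

        void syncFrom(FakeApiServer server) {
            known.clear();
            known.putAll(server.list());
        }

        Optional<String> get(String name) {
            return Optional.ofNullable(known.get(name));
        }
    }

    /** Create-only reconcile step: the decision is taken from the cache, not the server. */
    static String reconcile(String name, FakeCache cache, FakeApiServer server) {
        if (cache.get(name).isPresent()) {
            return "already-present";
        }
        server.create(name, "desired-data");
        return "created";
    }

    public static void main(String[] args) {
        FakeApiServer server = new FakeApiServer();
        server.create("ldap-sa-demo", "data-from-previous-run");
        FakeCache cache = new FakeCache(); // empty, as if not yet synced after a restart

        try {
            reconcile("ldap-sa-demo", cache, server);
        } catch (AlreadyExistsException e) {
            System.out.println("Conflict: " + e.getMessage());
        }

        cache.syncFrom(server);
        System.out.println(reconcile("ldap-sa-demo", cache, server));
    }
}
```

If the real behavior follows this pattern, the error on the first attempt and the stable state on the retry would both be expected symptoms of a cache that was empty at the time of the first reconciliation.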

@jonesbusy
Thanks. But why is the secret present on the next retry? I'm not overriding anything at the informer level, and it becomes stable on the next retry.

This only happens on controller startup.

I tried again to reproduce it locally (kind and native image) but couldn't. It only happens on my two OKD clusters (at least I see the same behavior on both).

Any hints to help me debug?

All controllers are watching all namespaces (set from env vars):

QUARKUS_OPERATOR_SDK_CONTROLLERS_LDAPSERVICEACCOUNTRECONCILIER_NAMESPACES=JOSDK_ALL_NAMESPACES
@csviri
That is very strange: on startup, the informers' caches should be synced before reconciliation.

@csviri
The easiest way to debug this is to check whether the informers' caches are filled; you should be able to get the informers through the context, on the EventSourceManager.

@csviri
Please let us know if you find something, but this is quite strange. If this were an issue in the SDK, most of our tests would also have an issue.

Not yet. At least it confirms there is most likely nothing wrong on the code side (maybe it's my infra, since I cannot reproduce it locally, but I don't see why either).

I will check how to debug the EventSourceManager and see what's inside.

For the moment, I found a very ugly workaround:

    @Override
    protected Secret handleCreate(Secret desired, TechDocFolder primary, Context<TechDocFolder> context) {
        try {
            return super.handleCreate(desired, primary, context);
        } catch (KubernetesClientException e) {
            if (e.getCode() == 409) {
                LOG.warn(
                        "Received conflict error while creating secret: {}.",
                        desired.getMetadata().getName());
                return desired;
            }
            throw e;
        }
    }
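
A variant of this workaround can be sketched in plain Java. It is only an illustration under assumptions: `ApiException` and `createOrAdopt` are hypothetical names, not part of JOSDK or fabric8. On a 409 the sketch re-reads the stored object instead of returning the desired one. That is a design assumption: when the desired state contains generated values (such as the random password in the `SecretDependent` earlier in this thread), the desired object may differ from what is actually stored in the cluster.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Plain-Java sketch. ApiException and createOrAdopt are hypothetical names,
// not JOSDK or fabric8 APIs.
final class TolerantCreateDemo {

    /** Stand-in for a client exception that carries an HTTP status code. */
    static final class ApiException extends RuntimeException {
        final int code;

        ApiException(int code, String message) {
            super(message);
            this.code = code;
        }
    }

    /**
     * Tries to create the resource. On a 409 conflict, falls back to reading
     * the object that is already stored; any other failure is rethrown.
     */
    static <T> T createOrAdopt(Supplier<T> create, Supplier<Optional<T>> readExisting) {
        try {
            return create.get();
        } catch (ApiException e) {
            if (e.code != 409) {
                throw e;
            }
            // If the object vanished in the meantime, surface the original conflict.
            return readExisting.get().orElseThrow(() -> e);
        }
    }

    public static void main(String[] args) {
        String result = createOrAdopt(
                () -> {
                    throw new ApiException(409, "secrets \"s3-storage\" already exists");
                },
                () -> Optional.of("stored-secret"));
        System.out.println(result);
    }
}
```

Like the original override, this only hides the symptom; it does not explain why the cache was empty on the first reconciliation.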

I see those logs only at startup. The main CR is also set with @MaxReconciliationInterval(interval = 15, timeUnit = TimeUnit.MINUTES), so there is no issue after startup.

I see those logs right after the controller has started (I have two controllers with the same design, and they behave the same):

2025-06-17 13:52:02,154 INFO  [io.jav.ope.pro.Controller] (Controller Starter for: techdocfolderreconcilier) 'techdocfolderreconcilier' controller started
2025-06-17 13:52:02,174 WARN  [ch.elc.ope.onp.con.min.tec.dep.SecretDependent] (pool-23-thread-13) Received conflict error while creating secret: s3-storage. This might be due to a previous failed reconciliation.
...
...
2025-06-17 13:52:02,377 INFO  [io.quarkus] (main) on-prem-operator 0.0.1-beta.25 native (powered by Quarkus 3.23.2) started in 1.025s. Listening on: http://0.0.0.0:8080
2025-06-17 13:52:02,377 INFO  [io.quarkus] (main) Profile prod activated.
2025-06-17 13:52:02,377 INFO  [io.quarkus] (main) Installed features: [cache, cdi, hibernate-validator, kubernetes, kubernetes-client, micrometer, minio-admin-client, minio-client, operator-sdk, oras-regis
ry, rest, rest-client, rest-client-jackson, rest-jackson, resteasy-problem, security, security-ldap, smallrye-context-propagation, smallrye-health, vertx[]