Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

[velero] Add velero image build with race condition fix patch#1711

Open
kvaps wants to merge 1 commit intocozystack:maincozystack/cozystack:mainfrom
kvaps:velero-image-buildkvaps/cozystack:velero-image-buildCopy head branch name to clipboard
Open

[velero] Add velero image build with race condition fix patch#1711
kvaps wants to merge 1 commit intocozystack:maincozystack/cozystack:mainfrom
kvaps:velero-image-buildkvaps/cozystack:velero-image-buildCopy head branch name to clipboard

Conversation

@kvaps
Copy link
Member

@kvaps kvaps commented Dec 10, 2025

What this PR does

Adds velero image build infrastructure similar to kamaji, including:

Changes:

  • Added packages/system/velero/images/velero/Dockerfile - builds velero binary with patch applied
  • Added packages/system/velero/images/velero/patches/9447.diff - race condition fix patch
  • Updated packages/system/velero/Makefile - added image target and improved update target
  • Updated packages/system/velero/values.yaml - added image configuration
  • Updated main Makefile - added velero to build targets

The patch fixes a race condition where multiple tasks waste time on expensive GetExposed operations when the concurrent limit is already reached. The fix checks the concurrent limit before performing the expensive operation.

Release note

[velero] Add custom velero image build with race condition fix patch

Summary by CodeRabbit

  • New Features

    • Added automated Velero image build and a new multi-stage Velero runtime image for releases.
    • Made metrics fields (interval/timeout) optional and added relabeling placeholders for metrics configuration.
  • Bug Fixes

    • Fixed a race condition in concurrent task handling to prevent dropped operations under load.
  • Chores

    • Bumped Velero to v1.17.1 and chart to 11.2.0; updated related plugin tags and monitoring alerts.

✏️ Tip: You can customize this high-level summary in your review settings.

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. bug Something isn't working labels Dec 10, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 10, 2025

Walkthrough

Adds Velero image build targets and Dockerfile, updates Helm chart version and values (making some monitoring fields optional), and introduces a datapath reservation pre-check to avoid concurrent-task races in multiple reconcilers.

Changes

Cohort / File(s) Summary
Top-level build
Makefile, packages/system/velero/Makefile
Added make -C packages/system/velero image to the main build. New image target in Velero Makefile builds the Velero image with docker buildx, updates chart values with repository and digest, and includes shared env include.
Helm chart metadata
packages/system/velero/charts/velero/Chart.yaml
Bumped appVersion 1.17.0 → 1.17.1 and version 11.0.0 → 11.2.0.
Helm templates (monitoring)
packages/system/velero/charts/velero/templates/podmonitor.yaml, packages/system/velero/charts/velero/templates/servicemonitor.yaml
Converted unconditional metric fields (interval, scrapeTimeout, scheme, tlsConfig) and blocks (metricRelabelings, relabelings) into conditional with blocks and use toYaml for optional rendering.
Helm values and schema
packages/system/velero/charts/velero/values.schema.json, packages/system/velero/charts/velero/values.yaml, packages/system/velero/values.yaml
Removed scrapeInterval and scrapeTimeout from schema metrics.required. Updated example/image tags to v1.17.1 and plugin v1.13.1, added metricRelabelings/relabelings placeholders and alert wording/timing adjustments. Added explicit velero.image block with repository, tag (digest), and pullPolicy.
Container image build artifacts
packages/system/velero/images/velero/Dockerfile, packages/system/velero/images/velero/patches/*
Added multi-stage Dockerfile building velero, velero-restore-helper, and velero-helper from source (golang builder → tiny runtime). Includes patch application in build context.
Datapath reservation / reconciler changes
packages/system/velero/images/velero/patches/9447.diff (applies to pkg/datapath/manager.go and several reconcilers)
Adds reservation support and public methods on Manager: CanAcceptNewTask(resume bool), ReserveSlot(taskName string), CompleteReservation(taskName string), ReleaseReservation(taskName string); reconcilers now ReserveSlot before expensive GetExposed calls, requeue if limit reached, and release/complete reservations on error/success.

Sequence Diagram(s)

sequenceDiagram
  participant Reconciler
  participant Manager
  participant ControllerOps as GetExposed/BRWatcher

  Reconciler->>Manager: ReserveSlot(taskName)
  alt Manager accepts
    Manager-->>Reconciler: ok
    Reconciler->>ControllerOps: GetExposed(...)
    alt success -> BRWatcher created
      ControllerOps-->>Manager: BRWatcher created (implies reservation completed)
      Manager-->>Reconciler: reservation completed
    else error
      ControllerOps-->>Manager: error
      Manager-->>Reconciler: ReleaseReservation(taskName)
      Reconciler-->>Reconciler: requeue after 5s or handle error
    end
  else Manager rejects (concurrent limit)
    Manager-->>Reconciler: limit reached
    Reconciler-->>Reconciler: log & requeue after 5s
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

  • Verify datapath Manager methods (reservation map concurrency, correctness of CanAcceptNewTask semantics, race conditions, and lock protection).
  • Review reconciler changes for correct reservation lifecycle (ReserveSlot, ReleaseReservation, CompleteReservation) on all error/success paths.
  • Validate Helm template conditional rendering and indentation to avoid YAML generation issues.
  • Check Dockerfile build flags, artifact copying, and chart values update logic that extracts image digest.

Poem

🐰 I dug a build and found a crate,
I stitched the image, sealed its fate.
Slots reserved so races cease,
Charts uplifted, metrics at peace.
Hop—new tags, new builds, a tiny feast 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main purpose of the PR: adding a Velero image build infrastructure with a race condition fix patch, which aligns with the substantial changes across Dockerfile, patch file, Makefiles, and configuration files.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @kvaps, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a robust build pipeline for a custom Velero image, primarily to incorporate a critical race condition fix that optimizes data path operations by preventing unnecessary resource-intensive calls when concurrent limits are met. Alongside this, the Velero Helm chart is updated to its latest patch version, and its monitoring configurations are enhanced for greater flexibility and control over Prometheus scraping.

Highlights

  • Custom Velero Image Build: New infrastructure, including a Dockerfile and Makefile targets, has been added to build a custom Velero image.
  • Race Condition Fix Integration: A critical patch (from PR #9447) is now integrated into the custom Velero image, resolving a race condition in data path controllers by checking concurrent limits before executing expensive operations.
  • Velero Version Update: The Velero Helm chart's appVersion has been updated from 1.17.0 to 1.17.1, and the chart version from 11.0.0 to 11.2.0.
  • Monitoring Configuration Enhancements: The PodMonitor and ServiceMonitor templates have been refined to conditionally apply scrape intervals/timeouts and to support additional metricRelabelings, relabelings, and tlsConfig.
  • Main Build Process Integration: The custom Velero image build process has been integrated into the main project's Makefile.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@kvaps kvaps force-pushed the velero-image-build branch from 41e6854 to 780367f Compare December 10, 2025 21:04
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the infrastructure to build a custom Velero image, which includes a critical patch to fix a race condition. The changes include a new Dockerfile, updates to the Makefiles for building the image, and modifications to the Velero Helm chart.

My review has identified a few issues:

  • Critical: The Helm templates for ServiceMonitor and PodMonitor contain a formatting error that will produce invalid YAML, causing deployments to fail.
  • High: The Dockerfile for building Velero uses || true when applying the patch, which could silently ignore patch failures and result in an image without the intended fix.
  • Medium: The Makefile target for the image build contains redundant commands that can be consolidated for better efficiency and readability.

I have provided specific suggestions to address these points.

I am having trouble creating individual review comments. Click here to see my feedback.

packages/system/velero/charts/velero/templates/podmonitor.yaml (44-52)

critical

The current formatting for metricRelabelings, relabelings, and tlsConfig will result in invalid YAML because the toYaml output is placed on the same line as the key. This will cause Helm installation to fail. You should place the toYaml call on a new line to ensure correct YAML structure.

    {{- with .Values.metrics.nodeAgentPodMonitor.metricRelabelings }}
    metricRelabelings:
      {{- toYaml . | nindent 6 }}
    {{- end }}
    {{- with .Values.metrics.nodeAgentPodMonitor.relabelings }}
    relabelings:
      {{- toYaml . | nindent 6 }}
    {{- end }}
    {{- with .Values.metrics.nodeAgentPodMonitor.tlsConfig }}
    tlsConfig:
      {{- toYaml . | nindent 6 }}
    {{- end }}

packages/system/velero/charts/velero/templates/servicemonitor.yaml (40-48)

critical

The current formatting for metricRelabelings, relabelings, and tlsConfig will result in invalid YAML because the toYaml output is placed on the same line as the key. This will cause Helm installation to fail. You should place the toYaml call on a new line to ensure correct YAML structure.

    {{- with .Values.metrics.serviceMonitor.metricRelabelings }}
    metricRelabelings:
      {{- toYaml . | nindent 6 }}
    {{- end }}
    {{- with .Values.metrics.serviceMonitor.relabelings }}
    relabelings:
      {{- toYaml . | nindent 6 }}
    {{- end }}
    {{- with .Values.metrics.serviceMonitor.tlsConfig }}
    tlsConfig:
      {{- toYaml . | nindent 6 }}
    {{- end }}

packages/system/velero/images/velero/Dockerfile (22)

high

Using || true with git apply suppresses errors, which means the build will succeed even if the patch fails to apply. This is risky as it could result in an image that doesn't contain the critical race condition fix. It's safer to remove || true and let the build fail if the patch cannot be applied. This ensures the integrity of the build and guarantees the fix is included.

RUN git apply /patches/*.diff

packages/system/velero/Makefile (23-30)

medium

These commands are inefficient as they read and write to values.yaml four times and calculate the image digest twice. You can combine these into a single, more efficient command that calculates the digest once and updates the YAML file in one pass. This improves both performance and readability.

REPOSITORY="$(REGISTRY)/velero"; \
TAG_WITH_DIGEST=$(TAG)@$$(yq e '."containerimage.digest"' images/velero.json -o json -r); \
yq -i "(.velero.image.repository = strenv(REPOSITORY)) | (.velero.image.tag = strenv(TAG_WITH_DIGEST)) | (.velero.charts.velero.values.image.repository = strenv(REPOSITORY)) | (.velero.charts.velero.values.image.tag = strenv(TAG_WITH_DIGEST))" values.yaml

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (3)
packages/system/velero/images/velero/Dockerfile (3)

19-19: Consider verifying source tarball integrity.

Downloading source code without checksum verification could allow supply chain attacks if GitHub is compromised or MITM'd. Consider adding a SHA256 checksum verification step for the downloaded tarball.


26-26: GIT_SHA is computed but never used.

GIT_SHA is extracted but not included in the -ldflags. Either remove the unused variable or add it to the build info (e.g., buildinfo.GitSHA).

-    GIT_SHA=$(git rev-parse HEAD 2>/dev/null || echo "unknown") && \
     go build -o /output/velero \

40-40: Pin the base image to a specific digest for reproducibility.

Using :latest tag for the runtime base image makes builds non-reproducible. Consider pinning to a specific version or digest.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 10d3574 and 780367f.

📒 Files selected for processing (10)
  • Makefile (1 hunks)
  • packages/system/velero/Makefile (2 hunks)
  • packages/system/velero/charts/velero/Chart.yaml (2 hunks)
  • packages/system/velero/charts/velero/templates/podmonitor.yaml (1 hunks)
  • packages/system/velero/charts/velero/templates/servicemonitor.yaml (1 hunks)
  • packages/system/velero/charts/velero/values.schema.json (0 hunks)
  • packages/system/velero/charts/velero/values.yaml (5 hunks)
  • packages/system/velero/images/velero/Dockerfile (1 hunks)
  • packages/system/velero/images/velero/patches/9447.diff (1 hunks)
  • packages/system/velero/values.yaml (1 hunks)
💤 Files with no reviewable changes (1)
  • packages/system/velero/charts/velero/values.schema.json
🧰 Additional context used
📓 Path-based instructions (2)
**/Chart.yaml

📄 CodeRabbit inference engine (AGENTS.md)

Use Helm Charts with the umbrella pattern and vendor upstream charts in charts/ directory

Files:

  • packages/system/velero/charts/velero/Chart.yaml
**/charts/**

📄 CodeRabbit inference engine (AGENTS.md)

Do not directly modify upstream charts; use vendoring approach instead

Files:

  • packages/system/velero/charts/velero/Chart.yaml
  • packages/system/velero/charts/velero/values.yaml
  • packages/system/velero/charts/velero/templates/podmonitor.yaml
  • packages/system/velero/charts/velero/templates/servicemonitor.yaml
🧠 Learnings (1)
📚 Learning: 2025-11-27T11:21:45.808Z
Learnt from: CR
Repo: cozystack/cozystack PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T11:21:45.808Z
Learning: Applies to **/Chart.yaml : Use Helm Charts with the umbrella pattern and vendor upstream charts in `charts/` directory

Applied to files:

  • packages/system/velero/charts/velero/Chart.yaml
🪛 checkmake (0.2.2)
packages/system/velero/Makefile

[warning] 16-16: Target body for "image" exceeds allowed length of 5 (11).

(maxbodylength)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build
🔇 Additional comments (9)
Makefile (1)

26-26: LGTM!

The Velero image build target follows the established pattern used by other packages in the build sequence.

packages/system/velero/charts/velero/Chart.yaml (1)

1-17: Verify this aligns with the vendoring approach for upstream charts.

Per coding guidelines, upstream charts in charts/ should not be directly modified. If this is a vendored upstream Velero chart, consider whether these version changes should come from re-vendoring the upstream chart rather than direct modification.

If this is intentionally a patched/custom chart for cozystack, disregard this comment.

Based on coding guidelines: "Do not directly modify upstream charts; use vendoring approach instead."

packages/system/velero/values.yaml (1)

2-5: Image configuration with digest pinning looks good.

The SHA256 digest ensures reproducibility despite the latest prefix. The IfNotPresent pull policy is appropriate when using digest-pinned images.

packages/system/velero/images/velero/Dockerfile (1)

2-2: The golang:1.24-bookworm base image is available on Docker Hub and is a valid choice. No action needed.

packages/system/velero/images/velero/patches/9447.diff (1)

1-106: Patch is applicable and correctly addresses an unmerged upstream fix.

This patch applies PR #9447 to Velero v1.17.1, which is currently open and not yet merged upstream. The fix properly addresses the race condition by adding a CanAcceptNewTask check before expensive GetExposed operations. The implementation correctly uses mutex locking, and the 5-second requeue delay is reasonable. The patch is needed since this fix is not yet included in the v1.17.1 release.

packages/system/velero/Makefile (2)

13-14: Potential circular dependency: values.yaml tag → Dockerfile ARG VERSION.

The update target extracts the image tag from values.yaml and injects it into the Dockerfile as ARG VERSION. This creates a dependency where values.yaml is the source of truth for the build version. However, line 26 in the image target modifies values.yaml with the built image digest, creating a circular update pattern.

Verify that this flow is intentional and that the Dockerfile ARG VERSION is only used during the build (not for subsequent chart deployments), otherwise synchronization issues could arise.


17-22: Verify that required make variables are defined.

The image target uses $(REGISTRY), $(call settag,$(TAG)), and $(BUILDX_ARGS) without defining them locally. Ensure these are exported from ../../../scripts/common-envs.mk (included on line 4). If not defined, the build will fail silently or use empty values.

packages/system/velero/charts/velero/templates/podmonitor.yaml (1)

35-52: ✅ Proper conditional rendering of optional metrics fields.

The refactoring correctly makes interval, scrapeTimeout, scheme, metricRelabelings, relabelings, and tlsConfig optional by using Helm with blocks. This prevents nil-value errors and aligns with the values.yaml updates that make these fields optional in the schema.

The indentation with nindent is correct for nested YAML serialization.

packages/system/velero/charts/velero/templates/servicemonitor.yaml (1)

31-48: ✅ Consistent conditional rendering across ServiceMonitor and PodMonitor.

The refactoring mirrors the changes in podmonitor.yaml with proper conditional blocks for optional fields. The value sourcing is correctly differentiated:

  • interval and scrapeTimeout from shared .Values.metrics (applied to both monitors)
  • scheme, metricRelabelings, relabelings, tlsConfig from monitor-specific .Values.metrics.serviceMonitor

Proper YAML indentation with nindent is maintained throughout.

image:
repository: velero/velero
tag: v1.17.0
tag: v1.17.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Find Makefile in velero package
find packages/system/velero -name "Makefile" -o -name "makefile"

Repository: cozystack/cozystack

Length of output: 94


🏁 Script executed:

# Check directory structure of velero package
ls -la packages/system/velero/

Repository: cozystack/cozystack

Length of output: 472


🏁 Script executed:

# Look for update targets or helm pull commands
grep -r "helm pull" packages/system/velero/ --include="Makefile" --include="*.mk" 2>/dev/null || echo "No helm pull found"

Repository: cozystack/cozystack

Length of output: 144


🏁 Script executed:

# Check for overlay or patch mechanisms
find packages/system/velero -type f \( -name "*overlay*" -o -name "*patch*" -o -name "kustomization*" \) 2>/dev/null

Repository: cozystack/cozystack

Length of output: 45


🏁 Script executed:

# Read the full Makefile to understand workflow
cat -n packages/system/velero/Makefile

Repository: cozystack/cozystack

Length of output: 1172


🏁 Script executed:

# Check what's in the charts directory after helm pull
ls -la packages/system/velero/charts/

Repository: cozystack/cozystack

Length of output: 226


🏁 Script executed:

# Check if there's a values.yaml in the vendored chart
ls -la packages/system/velero/charts/velero/ | head -20

Repository: cozystack/cozystack

Length of output: 715


🏁 Script executed:

# Read the root values.yaml to understand its purpose
cat -n packages/system/velero/values.yaml

Repository: cozystack/cozystack

Length of output: 993


🏁 Script executed:

# Verify the size and line count difference
wc -l packages/system/velero/values.yaml packages/system/velero/charts/velero/values.yaml

Repository: cozystack/cozystack

Length of output: 170


🏁 Script executed:

# Confirm the root values.yaml is a wrapper/override file by checking if it has the full chart structure
head -30 packages/system/velero/charts/velero/values.yaml

Repository: cozystack/cozystack

Length of output: 963


🏁 Script executed:

# Check if charts directory is in .gitignore
cat packages/system/velero/.gitignore 2>/dev/null || echo "No .gitignore in velero directory"

Repository: cozystack/cozystack

Length of output: 96


🏁 Script executed:

# Check git status of charts directory
git -C packages/system/velero ls-files charts/ 2>/dev/null | head -5 || echo "Not checking git"

Repository: cozystack/cozystack

Length of output: 192


🏁 Script executed:

# Verify the actual line 30 content from the root values.yaml
sed -n '20,30p' packages/system/velero/values.yaml

Repository: cozystack/cozystack

Length of output: 249


🏁 Script executed:

# Check git to see if charts/ is tracked
git ls-files packages/system/velero/charts/ | head -10

Repository: cozystack/cozystack

Length of output: 618


🏁 Script executed:

# Check if charts directory should be in gitignore by looking at parent directories
cat packages/system/.gitignore 2>/dev/null || echo "No .gitignore at packages/system level"

Repository: cozystack/cozystack

Length of output: 101


🏁 Script executed:

# Check root level gitignore
grep -i "charts" .gitignore 2>/dev/null || echo "No charts entry in root .gitignore"

Repository: cozystack/cozystack

Length of output: 97


🏁 Script executed:

# Verify the actual content at line 30 of the vendored chart values.yaml
sed -n '25,35p' packages/system/velero/charts/velero/values.yaml

Repository: cozystack/cozystack

Length of output: 478


🏁 Script executed:

# Check git log to see if charts/velero was intentionally added
git log --oneline packages/system/velero/charts/velero/values.yaml 2>/dev/null | head -5

Repository: cozystack/cozystack

Length of output: 133


🏁 Script executed:

# Check if there's a pattern in other packages for how they handle vendored charts
find packages -name "Makefile" -exec grep -l "helm pull" {} \; | head -5

Repository: cozystack/cozystack

Length of output: 1269


🏁 Script executed:

# Look at another package to see if they have similar structure
ls packages/system/*/charts/ 2>/dev/null | head -10

Repository: cozystack/cozystack

Length of output: 257


🏁 Script executed:

# Verify the comment's claim about line 133 plugin tag
sed -n '128,138p' packages/system/velero/charts/velero/values.yaml

Repository: cozystack/cozystack

Length of output: 446


🏁 Script executed:

# Check if charts/ appears in any .gitignore files
find . -name ".gitignore" -exec grep -l "charts" {} \; 2>/dev/null

Repository: cozystack/cozystack

Length of output: 124


🏁 Script executed:

# Look at cert-manager package to see if it has similar pattern
cat packages/system/cert-manager/Makefile | head -20

Repository: cozystack/cozystack

Length of output: 315


🏁 Script executed:

# Verify the learnings apply - check AGENTS.md
cat AGENTS.md | grep -A 5 -B 5 "charts" 2>/dev/null | head -20

Repository: cozystack/cozystack

Length of output: 931


Add charts/ directory to .gitignore to prevent committing vendored upstream charts.

The charts/velero/ directory is a vendored upstream chart pulled from tanzu/velero via make update (which runs helm pull). Per the codebase conventions (AGENTS.md), upstream charts should not be edited directly. However, this directory is currently tracked in version control, which creates confusion about which modifications are intentional overrides versus vendored content.

The proper pattern—using a wrapper values.yaml with custom overrides—is already correctly implemented in the Makefile (lines 24–26 modify the root values.yaml). To complete this setup:

  1. Add packages/system/velero/charts/ to .gitignore (or create one)
  2. Ensure all manual customizations live in the root values.yaml wrapper, not in the vendored chart files

This ensures the vendored chart remains pristine and can be re-pulled cleanly without version control conflicts.

🤖 Prompt for AI Agents
In packages/system/velero/charts/velero/values.yaml around line 30, the vendored
upstream chart is currently tracked; add the charts/ directory to .gitignore
(e.g., add packages/system/velero/charts/) so the vendored/chart files aren’t
committed, and move any intentional custom overrides into the root wrapper
values.yaml per the Makefile flow; ensure you remove/unstage the tracked charts/
files from git (git rm --cached) after updating .gitignore so future make
update/helm pull operations won’t create VCS noise.

RUN curl -sSL https://github.com/vmware-tanzu/velero/archive/refs/tags/${VERSION}.tar.gz | tar -xzvf- --strip=1

COPY patches /patches
RUN git apply /patches/*.diff || true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Silently ignoring patch failures is dangerous.

The || true suppresses patch application failures, meaning the build will succeed even if the race condition fix patch fails to apply. This defeats the purpose of this PR.

-RUN git apply /patches/*.diff || true
+RUN git apply /patches/*.diff
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
RUN git apply /patches/*.diff || true
RUN git apply /patches/*.diff
🤖 Prompt for AI Agents
In packages/system/velero/images/velero/Dockerfile around line 22, the RUN
command uses "git apply /patches/*.diff || true" which silences failures; remove
the "|| true" so the build fails on patch-apply errors, and instead guard the
command so it only runs when patch files exist (e.g., check for /patches/*.diff
presence before running git apply) so missing patches don't cause a false
failure; ensure the Dockerfile exits non-zero if git apply returns an error to
prevent a successful build when a patch fails.

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
@kvaps kvaps force-pushed the velero-image-build branch from 780367f to a7af320 Compare December 11, 2025 09:40
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
packages/system/velero/images/velero/patches/9447.diff (1)

202-303: Core reservation logic is sound and thread-safe.

The reservation mechanism correctly prevents the race condition:

  • Properly tracks both active tasks (tracker) and reserved slots (reservations)
  • Thread-safe with consistent mutex usage
  • Concurrent limit correctly enforced: (len(m.tracker) + len(m.reservations)) >= m.cocurrentNum
  • No memory leaks - all reservation cleanup paths are covered

Minor observation: CanAcceptNewTask (lines 250-262) and CompleteReservation (lines 288-295) are defined but not called anywhere in this patch. Additionally, CreateMicroServiceBRWatcher directly deletes from reservations (line 240) rather than calling CompleteReservation(). These unused methods may be for API completeness or future use.

If CanAcceptNewTask and CompleteReservation are not intended for future use, consider removing them to reduce unused code. Alternatively, CreateMicroServiceBRWatcher could be refactored to call CompleteReservation() for consistency:

 m.tracker[taskName] = MicroServiceBRWatcherCreator(client, kubeClient, mgr, taskType, taskName, namespace, podName, containerName, associatedObject, callbacks, log)
 
-// Release reservation if it exists (it will be replaced by actual AsyncBR in tracker)
-delete(m.reservations, taskName)
+// Complete the reservation (it will be replaced by actual AsyncBR in tracker)
+m.CompleteReservation(taskName)
 
 return m.tracker[taskName], nil

Note: This refactor would require unlocking before the call and re-locking after, or making CompleteReservation an internal unlocked helper, so the current inline approach is simpler and acceptable.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 780367f and a7af320.

📒 Files selected for processing (10)
  • Makefile (1 hunks)
  • packages/system/velero/Makefile (2 hunks)
  • packages/system/velero/charts/velero/Chart.yaml (2 hunks)
  • packages/system/velero/charts/velero/templates/podmonitor.yaml (1 hunks)
  • packages/system/velero/charts/velero/templates/servicemonitor.yaml (1 hunks)
  • packages/system/velero/charts/velero/values.schema.json (0 hunks)
  • packages/system/velero/charts/velero/values.yaml (5 hunks)
  • packages/system/velero/images/velero/Dockerfile (1 hunks)
  • packages/system/velero/images/velero/patches/9447.diff (1 hunks)
  • packages/system/velero/values.yaml (1 hunks)
💤 Files with no reviewable changes (1)
  • packages/system/velero/charts/velero/values.schema.json
🚧 Files skipped from review as they are similar to previous changes (5)
  • packages/system/velero/values.yaml
  • Makefile
  • packages/system/velero/charts/velero/templates/servicemonitor.yaml
  • packages/system/velero/images/velero/Dockerfile
  • packages/system/velero/charts/velero/values.yaml
🧰 Additional context used
📓 Path-based instructions (2)
**/charts/**

📄 CodeRabbit inference engine (AGENTS.md)

Do not directly modify upstream charts; use vendoring approach instead

Files:

  • packages/system/velero/charts/velero/templates/podmonitor.yaml
  • packages/system/velero/charts/velero/Chart.yaml
**/Chart.yaml

📄 CodeRabbit inference engine (AGENTS.md)

Use Helm Charts with the umbrella pattern and vendor upstream charts in charts/ directory

Files:

  • packages/system/velero/charts/velero/Chart.yaml
🧠 Learnings (4)
📚 Learning: 2025-11-27T11:21:45.808Z
Learnt from: CR
Repo: cozystack/cozystack PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T11:21:45.808Z
Learning: Applies to **/Chart.yaml : Use Helm Charts with the umbrella pattern and vendor upstream charts in `charts/` directory

Applied to files:

  • packages/system/velero/charts/velero/Chart.yaml
  • packages/system/velero/Makefile
📚 Learning: 2025-11-27T11:21:45.808Z
Learnt from: CR
Repo: cozystack/cozystack PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T11:21:45.808Z
Learning: Applies to **/charts/** : Do not directly modify upstream charts; use vendoring approach instead

Applied to files:

  • packages/system/velero/Makefile
📚 Learning: 2025-07-14T16:23:12.803Z
Learnt from: NickVolynkin
Repo: cozystack/cozystack PR: 1196
File: packages/apps/http-cache/Makefile:24-27
Timestamp: 2025-07-14T16:23:12.803Z
Learning: In the cozystack repository, the `readme-generator` tool removes enum contents from values.schema.json files during its operation. Therefore, when using readme-generator in Makefiles, any enum values need to be injected back into the schema using yq commands after readme-generator has run, not before.

Applied to files:

  • packages/system/velero/Makefile
📚 Learning: 2025-07-23T09:15:09.658Z
Learnt from: lllamnyp
Repo: cozystack/cozystack PR: 1233
File: packages/system/hetzner-robotlb/charts/robotlb/templates/deployment.yaml:33-35
Timestamp: 2025-07-23T09:15:09.658Z
Learning: The `./charts/robotlb/` directory in the hetzner-robotlb package contains vendored code, and the team generally avoids modifying vendored code to maintain clean separation from upstream dependencies.

Applied to files:

  • packages/system/velero/Makefile
🪛 checkmake (0.2.2)
packages/system/velero/Makefile

[warning] 16-16: Target body for "image" exceeds allowed length of 5 (11).

(maxbodylength)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build
🔇 Additional comments (7)
packages/system/velero/charts/velero/templates/podmonitor.yaml (1)

35-52: Excellent conditional rendering of optional monitoring fields.

The use of {{- with ... }} guards to conditionally render scrapeInterval, scrapeTimeout, scheme, metricRelabelings, relabelings, and tlsConfig is idiomatic Helm and provides good flexibility. Fields are omitted from the generated YAML when not specified in values.yaml, avoiding unnecessary null or empty values in the PodMonitor resource.

Indentation using nindent 6 is correct for fields nested under the podMetricsEndpoints list item, and toYaml is appropriately applied to structured fields.

packages/system/velero/images/velero/patches/9447.diff (5)

1-7: LGTM - Clear changelog entry.

The changelog entry accurately describes the race condition fix.


16-48: Well-structured reservation pattern prevents race condition.

The reservation lifecycle is correctly implemented:

  • Reserve slot before expensive GetExposed operation
  • Release reservation on all error paths (GetExposed failures, nil results, watcher creation errors)
  • Reservation is completed when CreateMicroServiceBRWatcher succeeds

This effectively prevents multiple tasks from wasting resources on GetExposed when the concurrent limit is already reached.


62-100: LGTM - Consistent reservation pattern with additional validation.

The implementation follows the same correct pattern as DataDownloadReconciler. The additional NodeOS validation (lines 83-85) properly releases the reservation before erroring out, maintaining the lifecycle integrity.


113-145: LGTM - Consistent reservation implementation.

The PodVolumeBackupReconciler correctly implements the reservation pattern, maintaining consistency with the other controllers. All error paths properly clean up reservations.


158-190: LGTM - Consistent reservation implementation.

The PodVolumeRestoreReconciler correctly implements the reservation pattern with proper error handling and cleanup on all paths.

packages/system/velero/Makefile (1)

4-4: The include path and variable exports are correct.

The file scripts/common-envs.mk exists and properly exports all three required variables: REGISTRY, TAG, and BUILDX_ARGS. The path ../../../scripts/common-envs.mk from packages/system/velero/Makefile correctly resolves to the file at the repository root.

@@ -1,5 +1,5 @@
apiVersion: v2
appVersion: 1.17.0
appVersion: 1.17.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== git ls-files matching velero Chart.yaml ==="
git ls-files 'packages/**/velero/**Chart.yaml' || true

echo
echo "=== show packages/system/velero/charts/velero/Chart.yaml (if exists) ==="
if [ -f packages/system/velero/charts/velero/Chart.yaml ]; then
  sed -n '1,120p' packages/system/velero/charts/velero/Chart.yaml
else
  echo "File not found"
fi

echo
echo "=== show packages/system/velero/Makefile (if exists) ==="
if [ -f packages/system/velero/Makefile ]; then
  sed -n '1,240p' packages/system/velero/Makefile
else
  echo "File not found: packages/system/velero/Makefile"
fi

echo
echo "=== grep for helm pull occurrences in repo ==="
rg "helm pull" -n --hidden || true

echo
echo "=== show top-level Makefile if exists ==="
if [ -f Makefile ]; then
  sed -n '1,240p' Makefile
else
  echo "No top-level Makefile"
fi

Repository: cozystack/cozystack

Length of output: 7860


Do not modify vendored Chart.yaml under charts/ (packages/system/velero/charts/velero/).

The Chart.yaml (appVersion/version changes) lives in a vendored chart directory that the Makefile repopulates with helm pull ... --untar --untardir charts; local edits will be overwritten and violate the repo's vendoring guideline. Use an umbrella/wrapper chart (e.g., packages/system/velero/Chart.yaml) to declare the vendored dependency and apply local overrides via values/templates, and keep the vendored charts/ copy unmodified. Adjust the update/image workflow to version the wrapper chart instead of editing charts/velero/Chart.yaml.

🤖 Prompt for AI Agents
packages/system/velero/charts/velero/Chart.yaml lines 2-2: this is a vendored
chart file that must not be edited directly because the Makefile repopulates
charts/ via `helm pull ... --untar --untardir charts`; revert any local edits to
this file, and instead update the umbrella/wrapper chart at
packages/system/velero/Chart.yaml (or add it if missing) to declare the vendored
chart as a dependency and to bump appVersion/version there; apply image or
configuration changes via the wrapper chart's values.yaml or templates (or via
values overrides in CI), and update the update/image workflow to change the
wrapper chart's version and values rather than modifying files under charts/.

Comment on lines +16 to +27
image:
docker buildx build images/velero \
--tag $(REGISTRY)/velero:$(call settag,$(TAG)) \
--cache-from type=registry,ref=$(REGISTRY)/velero:latest \
--cache-to type=inline \
--metadata-file images/velero.json \
$(BUILDX_ARGS)
REPOSITORY="$(REGISTRY)/velero" \
yq -i '.velero.image.repository = strenv(REPOSITORY)' values.yaml
TAG=$(TAG)@$$(yq e '."containerimage.digest"' images/velero.json -o json -r) \
yq -i '.velero.image.tag = strenv(TAG)' values.yaml
rm -f images/velero.json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find values.yaml location in velero package
find packages/system/velero -name "values.yaml" -type f 2>/dev/null | head -20

Repository: cozystack/cozystack

Length of output: 146


🏁 Script executed:

#!/bin/bash
# Check the structure of velero package
ls -la packages/system/velero/ 2>/dev/null | head -30

Repository: cozystack/cozystack

Length of output: 472


🏁 Script executed:

#!/bin/bash
# Look for vendoring or chart references in velero Makefile
cat packages/system/velero/Makefile

Repository: cozystack/cozystack

Length of output: 983


🏁 Script executed:

#!/bin/bash
# Check if there's a Chart.yaml or chart definition
find packages/system/velero -name "Chart.yaml" -o -name "chart.yaml" -type f 2>/dev/null

Repository: cozystack/cozystack

Length of output: 144


🏁 Script executed:

#!/bin/bash
# Search for references to velero chart source or vendoring
rg "velero" packages/system/velero/ -A 3 -B 3 2>/dev/null | head -100

Repository: cozystack/cozystack

Length of output: 7190


🏁 Script executed:

#!/bin/bash
# Check the root Chart.yaml to understand wrapper structure
cat packages/system/velero/Chart.yaml

Repository: cozystack/cozystack

Length of output: 195


🏁 Script executed:

#!/antml/bash
# Check if values.yaml is in git and how it's tracked
git -C packages/system/velero/ ls-files values.yaml 2>/dev/null || echo "Not found or not in git"

Repository: cozystack/cozystack

Length of output: 74


🏁 Script executed:

#!/bin/bash
# Look at git status/history context for this file if available
git -C packages/system/velero/ log --oneline values.yaml 2>/dev/null | head -10 || echo "No git history available"

Repository: cozystack/cozystack

Length of output: 133


🏁 Script executed:

#!/bin/bash
# Search for similar patterns in other system packages - are they also modifying values.yaml via Makefile?
rg "yq -i.*values\.yaml" packages/system/ --type makefile 2>/dev/null

Repository: cozystack/cozystack

Length of output: 45


🏁 Script executed:

#!/bin/bash
# Check the package.mk to understand the standard workflow
cat packages/system/velero/../../../scripts/package.mk 2>/dev/null | head -100

Repository: cozystack/cozystack

Length of output: 1313


🏁 Script executed:

#!/bin/bash
# Check if there's any documentation about the build workflow
find packages/system/velero -name "README*" -o -name "*.md" -type f 2>/dev/null

Repository: cozystack/cozystack

Length of output: 109


🏁 Script executed:

#!/bin/bash
# Look for comments in the Makefile about the workflow intent
head -50 packages/system/velero/Makefile

Repository: cozystack/cozystack

Length of output: 983


🏁 Script executed:

#!/bin/bash
# Check how the image target is used - is it in CI/CD or build process?
rg "make image" . --type makefile --type yaml --type yml -B 2 -A 2 2>/dev/null | head -50

Repository: cozystack/cozystack

Length of output: 45


🏁 Script executed:

#!/bin/bash
# Check if values.yaml modifications are supposed to be committed
git -C packages/system/velero show HEAD:values.yaml 2>/dev/null | head -20

Repository: cozystack/cozystack

Length of output: 45


🏁 Script executed:

#!/bin/bash
# Look for any build or CI workflow files that might use "make image"
find . -path ./node_modules -prune -o -type f \( -name "*.yml" -o -name "*.yaml" \) -print 2>/dev/null | xargs grep -l "make image" 2>/dev/null | head -10

Repository: cozystack/cozystack

Length of output: 45


Document the intended workflow for values.yaml modifications in the image target.

The image target (lines 24 and 26) uses yq -i to modify values.yaml in-place, creating several concerns that lack documentation:

  1. Non-idempotent modifications: Running make image multiple times will rewrite the file even when content is unchanged, with no safeguard against repeated edits.
  2. Unclear workflow intent: There is no documentation indicating whether these modifications should be committed to the repository, regenerated on each build, or handled differently.
  3. Unusual pattern: This approach is unique to velero and not used in other system packages, suggesting it may need review for consistency.

Clarify the expected workflow: Are values.yaml changes meant to be committed as part of the build process, or should the image target be refactored to avoid modifying tracked files? If committed, document this expectation. If not, consider separating image metadata from chart configuration.

🧰 Tools
🪛 checkmake (0.2.2)

[warning] 16-16: Target body for "image" exceeds allowed length of 5 (11).

(maxbodylength)

🤖 Prompt for AI Agents
In packages/system/velero/Makefile around lines 16 to 27, the image target
mutates tracked values.yaml in-place with yq without any documentation or
safeguards; either document the intended workflow and commit policy or stop
modifying the tracked file directly. Fix by doing one of: (A) document in the
Makefile header/README that the image target updates values.yaml and must be
committed (or add a --commit flag that performs the yq edits and git commit),
and make edits idempotent by reading current values and only writing when they
differ; or (B) avoid touching tracked files by writing image metadata into a
temporary values file or separate images metadata file and updating
Helm/packaging to use that file, leaving values.yaml unchanged; implement
whichever approach you choose and add a short comment in the Makefile describing
the decision and expected developer workflow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Morty Proxy This is a proxified and sanitized view of the page, visit original site.