Add log file rotation #127667


Merged
merged 1 commit into from
May 15, 2025

Conversation

zylxjtu
Contributor

@zylxjtu zylxjtu commented Sep 26, 2024

What type of PR is this?

This is to enhance the kube-log-runner

/kind bug

What this PR does / why we need it:

The initial kube-log-runner never flushes its log output explicitly, so logs may not reach the file in time. This PR adds periodic flushing (every 5 seconds).

Adds optional log rotation: the log file is rotated when its size exceeds a configurable maximum, and old log files whose age exceeds a configurable limit are removed.

Keeps the file open across writes; the fd is closed and reopened only when needed during log rotation, for better performance.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

kube-log-runner: rotating log output into a new file when reaching a certain file size can be requested via the new `-log-file-size` parameter. `-log-file-age` enables automatic removal of old output files. Periodic flushing can be requested through `-flush-interval`.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 26, 2024
@k8s-ci-robot
Contributor

Hi @zylxjtu. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. wg/structured-logging Categorizes an issue or PR as relevant to WG Structured Logging. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Sep 26, 2024
@zylxjtu zylxjtu changed the title from "Add log rotator" to "Add log file rotation" Sep 26, 2024
@jsturtevant
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 30, 2024
@dgrisonnet
Member

/triage accepted
/assign @mengjiao-liu

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 3, 2024
@zylxjtu
Contributor Author

zylxjtu commented Oct 14, 2024

Can someone please help to take a look at this PR? @mengjiao-liu

Member

@mengjiao-liu mengjiao-liu left a comment

First, this file lacks a corresponding test file, even though it has quite a lot of changes.

Second, I used the following commands for a simulation, but the result is not as expected.
So I suggest adding a corresponding test file and running it.

dd if=/dev/zero of=/tmp/test.log count=1 bs=1M
ls -l /tmp/test.log  
cd <kube-log-runner-command-dir>
./kube-log-runner -log-file=/tmp/test.log -log-file-size=1 -also-stdout echo "hello world"

ls -l /tmp/test.*
-rw-r--r--  1 lmj  wheel  1048588 Oct 15 17:06 test-20241015-170629.log
-rw-r--r--  1 lmj  wheel        0 Oct 15 17:06 test.log   # Expected "hello world" here, but the file is empty (size 0).

)

var (
logFilePath = flag.String("log-file", "", "If non-empty, save stdout to this file")
logFileSize = flag.Uint("log-file-size", 0, "Useful with log-file, if non-zero, rotate log file when it reaches this size in MB")
Member

Is it possible for users to set the byte unit by themselves? For example, 10 MB or 10GB, instead of only having MB as a fixed unit? 😃

Contributor Author

Regarding the file size being zero: this is actually one of the issues this PR is meant to tackle, namely that the log file was not refreshed/synced in time. This PR adds a 5s refresh interval, so there will still be a delay, but the file will be synced much sooner than before.

Contributor Author

@zylxjtu zylxjtu Oct 15, 2024

I'm trying to keep the parameters of "kube-log-runner" as simple as possible. This looks to me important for the case if the process loaded by "kube-log-runner" has a lot of parameters itself, such as "kubelet". Also, the purpose of log file rotation is to make it convenient for users to check the text log file. A GB-sized log file does not seem easy for users to look through. So on second thought, I would prefer to keep the option MB-only.

Contributor

Ambiguous (size in B, KB, or MB?) command line flags are not simple because the user has to be very careful about using them correctly. If we do this, then we should do it properly, which means accepting a string for a resource.Quantity and parsing it accordingly.

The same argument applies to age: it should accept a time.Duration. There's even a flag.Duration for that.

Contributor

This looks to me important for the case if the process loaded by "kube-log-runner" has a lot of parameters itself,

I don't understand this argument. Why should the number of arguments for the process influence what parameters get accepted by kube-log-runner?

Contributor Author

I guess what I mean is that I prefer avoiding complicated parameters if we can achieve similar functionality without them.

Contributor Author

Updated the log-file-size and log-file-age formats to make them more explicit.

)

var (
logFilePath = flag.String("log-file", "", "If non-empty, save stdout to this file")
logFileSize = flag.Uint("log-file-size", 0, "Useful with log-file, if non-zero, rotate log file when it reaches this size in MB")
logFileAge = flag.Uint("log-file-age", 0, "Useful with log-file-size, if non-zero, remove log files older than this many days")
Member

As with the suggestion above, can we let users define whether it's a few hours, days or months (e.g. 8h, 8d), instead of having days as a fixed unit?

Contributor Author

Same as the above comment: there is a trade-off between flexibility and complexity here. I would prefer to keep the parameters simple if possible.

w.currentSize += int64(len(p))

// if file size over maxsize rotate the log file
if (w.currentSize / 1000000) >= int64(w.maxSize) {
Member

@mengjiao-liu mengjiao-liu Oct 15, 2024

The problem is that the file size check for rotation is incorrect because it divides the size by 1,000,000 instead of using the correct conversion for megabytes (1 MB = 1024*1024 bytes). In addition, using division for byte conversion can result in an integer division error; instead, multiplication can be used here.
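To make the difference concrete, the sketch below puts the check from the snippet next to a variant that converts the limit to bytes by multiplication. Both function names are hypothetical, and the second function is only one way to follow the suggestion, not necessarily what the PR ended up doing.

```go
package main

import "fmt"

const bytesPerMiB = 1 << 20 // 1024 * 1024

// shouldRotateDecimal mirrors the check under review: it converts the
// current size to decimal megabytes by integer division.
func shouldRotateDecimal(currentSize int64, maxSizeMB uint) bool {
	return currentSize/1000000 >= int64(maxSizeMB)
}

// shouldRotate converts the limit to bytes instead and compares in bytes,
// using 1 MB = 1024*1024 bytes as suggested in the review.
func shouldRotate(currentSize int64, maxSizeMB uint) bool {
	return currentSize >= int64(maxSizeMB)*bytesPerMiB
}

func main() {
	for _, size := range []int64{999999, 1000000, 1048575, 1048576} {
		fmt.Printf("size=%d decimal=%v binary=%v\n",
			size, shouldRotateDecimal(size, 1), shouldRotate(size, 1))
	}
}
```

With a limit of 1, the original check already fires at 1,000,000 bytes, which is 48,576 bytes (about 4.6%) short of 1024*1024. Comparing in bytes also removes the need to reason about how the division truncates.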

Contributor Author

Will update.

@zylxjtu
Contributor Author

zylxjtu commented Oct 28, 2024

@mengjiao-liu and others, please help to take a look. I hope it can catch the k8s 1.32 release train if possible, thanks.

@github-project-automation github-project-automation bot moved this from not-only-sig-node to Waiting on Author in SIG Node: code and documentation PRs May 14, 2025
@github-project-automation github-project-automation bot moved this from Needs Triage to In Progress in SIG CLI May 14, 2025
@github-project-automation github-project-automation bot moved this from Needs Triage to In Progress in SIG Apps May 14, 2025
@pohly
Contributor

pohly commented May 14, 2025

Please squash into one commit as part of the next push.

@zylxjtu
Contributor Author

zylxjtu commented May 14, 2025

/remove-area/apiserver

@k8s-ci-robot
Contributor

k8s-ci-robot commented May 14, 2025

@zylxjtu: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-apidiff-client-go | 6465b2b | link | false | `/test pull-kubernetes-apidiff-client-go` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


@zylxjtu
Contributor Author

zylxjtu commented May 14, 2025

/test pull-kubernetes-e2e-kind

@pohly
Contributor

pohly commented May 15, 2025

/skip

@pohly
Contributor

pohly commented May 15, 2025

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 15, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 8a98dc301d65f4b0fb7c38b93a0fa3e6f7fd0dd3

@pohly
Contributor

pohly commented May 15, 2025

Please add a release note in the description.

@pohly
Contributor

pohly commented May 15, 2025

/approve

Please add a release note in the description.

Never mind, I can do that myself:

/release-note-edit

kube-log-runner: rotating log output into a new file when reaching a certain file size can be requested via the new `-log-file-size` parameter. `-log-file-age` enables automatic removal of old output files. Periodic flushing can be requested through `-flush-interval`.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels May 15, 2025
@pohly
Contributor

pohly commented May 15, 2025

/release-note-edit

kube-log-runner: rotating log output into a new file when reaching a certain file size can be requested via the new `-log-file-size` parameter. `-log-file-age` enables automatic removal of old output files. Periodic flushing can be requested through `-flush-interval`.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pohly, zylxjtu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 15, 2025
@k8s-ci-robot k8s-ci-robot merged commit 35956e9 into kubernetes:master May 15, 2025
14 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.34 milestone May 15, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in SIG Apps May 15, 2025
@github-project-automation github-project-automation bot moved this from Changes Requested to Closed / Done in SIG Auth May 15, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in SIG CLI May 15, 2025
@github-project-automation github-project-automation bot moved this from Waiting on Author to Done in SIG Node: code and documentation PRs May 15, 2025
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/code-generation area/dependency Issues or PRs related to dependency changes area/ipvs area/kube-proxy area/kubeadm area/kubectl area/kubelet area/provider/gcp Issues or PRs related to gcp provider area/release-eng Issues or PRs related to the Release Engineering subproject cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/etcd Categorizes an issue or PR as relevant to SIG Etcd. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/release Categorizes an issue or PR as relevant to SIG Release. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/storage Categorizes an issue or PR as relevant to SIG Storage. 
sig/testing Categorizes an issue or PR as relevant to SIG Testing. sig/windows Categorizes an issue or PR as relevant to SIG Windows. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on. wg/device-management Categorizes an issue or PR as relevant to WG Device Management. wg/structured-logging Categorizes an issue or PR as relevant to WG Structured Logging.
Projects
Archived in project
Status: Done
Status: Closed / Done
Status: Done
8 participants