Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Add log file rotation #127667

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 15, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
71 changes: 71 additions & 0 deletions 71 staging/src/k8s.io/component-base/logs/kube-log-runner/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,66 @@ Why do we need this?
- Nowadays, the `--log-file` parameter is deprecated for Kubernetes components
and should not be used anymore. `kube-log-runner` is a direct replacement.

## Flags

- -flush-interval
The `-flush-interval` flag is a duration flag that specifies how frequently the log
file content is flushed to disk. The default value is 0, meaning no periodic flushing.
If set to a non-zero value, the log file is flushed at the specified interval.

Type: Duration

Default: 0 (flushing disabled)

Usage: When set to non-zero, the log file is flushed every specified interval.
While this may not be necessary on Linux, on Windows, it ensures that recent log
entries are written to disk in near real-time, which is particularly useful for
users who need to monitor logs as they are generated without delay.

- -log-file-size
The `-log-file-size` flag is an optional string flag that sets a size limit for
the log file, triggering automatic log rotation when the specified size is reached.
This is especially useful in production environments where the log file may become
too large to view effectively.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
too large to view effectively.
too large to view effectively.
Beware that rotation can happen at arbitrary points in the byte stream emitted by the command.
This can lead to splitting a log entry into two parts, with the first part in one file
and the second part in the next file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not forget about the problem.

I created https://github.com/kubernetes/kubernetes/compare/master...pohly:kubernetes:log-runner-torn-writes?expand=1 to demonstrate it and #131788 for tracking.

Beware that rotation can happen at arbitrary points in the byte stream emitted by the command.
This can lead to splitting a log entry into two parts, with the first part in one file
and the second part in the next file.

Type: String (expects a value in Resource.Quantity format, such as 10M or 500K)

Default: "0" (disabled, no automatic rotation of log files)

Usage: When set to a positive value, the log file will rotate upon reaching the specified
size limit. The current log file’s contents will be saved to a backup file, and a new log
file will be created at the path specified by the `-log-file` flag, ready for future log entries.

Backup File Naming Convention:
`<original-file-name>-<timestamp><file-extension>`.
* `<original-file-name>`: The name of the original log file, without the file extension.
* `<timestamp>`: A timestamp is added to each backup file’s name to uniquely identify it
based on the time it was created. The timestamp follows the format "20060102-150405".
For example, a backup created on June 2, 2006, at 3:04:05 PM would include this timestamp.
* `<file-extension>`: The original file’s extension (e.g., .log) remains unchanged.
This naming convention ensures easy organization and retrieval of rotated log files based on their creation time.

- -log-file-age
The `-log-file-age` flag is an optional time duration setting that defines how long
old backup log files are retained. This flag is used alongside log rotation (enabled
by setting a positive value for -log-file-size) to help manage storage by removing
outdated backup logs.

Type: Duration

Default: 0 (disabled, no automatic deletion of backup files)

Usage: When -log-file-age is set to a positive duration (e.g., 24h for 24 hours)
and log rotation is enabled, backup log files will be automatically deleted if
the time when they were created (as encoded in the file name) is older than the
specified duration from the current time.

This ensures that only recent backup logs are kept, preventing accumulation of old logs
and reducing storage usage.

For example instead of running kube-apiserver like this:
```bash
"/bin/sh",
Expand Down Expand Up @@ -49,6 +109,17 @@ kube-log-runner -log-file=/tmp/log echo "hello world"
# Copy into log file and print to stdout (same as 2>&1 | tee -a /tmp/log).
kube-log-runner -log-file=/tmp/log -also-stdout echo "hello world"

# Copy into log file and print to stdout (same as 2>&1 | tee -a /tmp/log),
# will flush the logging file in 5s,
# rotate the log file when its size exceedes 10 MB
kube-log-runner -flush-interval=5s -log-file=/tmp/log -log-file-size=10M -also-stdout echo "hello world"

# Copy into log file and print to stdout (same as 2>&1 | tee -a /tmp/log),
# will flush the logging file in 10s,
# rotate the log file when its size exceedes 10 MB,
# and clean up old rotated log files when their age are older than 168h (7 days)
kube-log-runner -flush-interval=10s -log-file=/tmp/log -log-file-size=10M -log-file-age=168h -also-stdout echo "hello world"

# Redirect only stdout into log file (same as 1>>/tmp/log).
kube-log-runner -log-file=/tmp/log -redirect-stderr=false echo "hello world"
```
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,205 @@
/*
Copyright 2025 The Kubernetes Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package logrotation

import (
"io"
"os"
"path/filepath"
"strings"
"sync"
"time"
)

const timeLayout string = "20060102-150405"

type rotationFile struct {
// required, the max size of the log file in bytes, 0 means no rotation
maxSize int64
// required, the max age of the log file, 0 means no cleanup
maxAge time.Duration
filePath string
mut sync.Mutex
file *os.File
currentSize int64
lasSyncTime time.Time
flushInterval time.Duration
}

func Open(filePath string, flushInterval time.Duration, maxSize int64, maxAge time.Duration) (io.WriteCloser, error) {
w := &rotationFile{
filePath: filePath,
maxSize: maxSize,
maxAge: maxAge,
flushInterval: flushInterval,
}

logFile, err := os.OpenFile(w.filePath, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
if err != nil {
return nil, err
}

w.file = logFile

w.lasSyncTime = time.Now()

if w.maxSize > 0 {
info, err := os.Stat(w.filePath)
if err != nil {
return nil, err
}
w.currentSize = info.Size()
}

return w, nil
}

// Write implements the io.Writer interface.
func (w *rotationFile) Write(p []byte) (n int, err error) {
w.mut.Lock()
defer w.mut.Unlock()

n, err = w.file.Write(p)
if err != nil {
return 0, err
}

if w.flushInterval > 0 && time.Since(w.lasSyncTime) >= w.flushInterval {
err = w.file.Sync()
if err != nil {
return 0, err
}
w.lasSyncTime = time.Now()
}

if w.maxSize > 0 {
w.currentSize += int64(len(p))

// if file size over maxsize rotate the log file
if w.currentSize >= w.maxSize {
err = w.rotate()
if err != nil {
return 0, err
}
}
}

return n, nil
}

func (w *rotationFile) rotate() error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure whether I asked about it here already: can we somehow ensure that rotation doesn't cut a log message into two halves? If that happens, the end of one file and start of the next cannot be parsed.

One possible solution would be for logrunner to set up a Unix domain datagram socket, then read entire messages from it.

This wasn't necessary before because it was just a bit pipe, but with log rotation it becomes a router for log messages.

cc @harshanarayana

Relevant in the context of kubernetes/enhancements#5022.

Copy link
Contributor Author

@zylxjtu zylxjtu Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the context! There will be performance tax by adding a datagram layer? If we want to add that, I think we should make it to be optional so that people who do not care much about the splitting can avoid some performance loss?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, any high-level idea on how to test this?

Copy link
Contributor Author

@zylxjtu zylxjtu Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I feel a little confused here. Can you please help to explain how a writing to stderr/stdout can result into halves messages with the rotation? if the write() interface being called holds the "original" message boundary? if not, this message boundary can't be guaranteed by unix datagram neither as the message in write() call is the visible message boundary for datagram? Unless we have some info on application level message boundary, this does not seem possible? Please let me know if I missed anything @pohly @harshanarayana? An example will be much helpful, thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With normal pipes, the writer might do a single write with 1024 bytes containing one log entry. The kernels is free to split those bytes up however it likes, so logrunner reading into a 1024 byte buffer might only get 512 bytes, write those to one file, rotate, and then write the next 512 of that log entry in the next file.

I would need to read up on Unix Datagram, but I'm pretty sure it avoids that. If I remember correctly, the reader can determine what the size of the next datagram is. How well this is supported by Go is yet another open question.

There will be performance tax by adding a datagram layer?

I don't think so (it's still all in kernel memory), but it would be worthwhile to benchmark it.

Also, any high-level idea on how to test this?

A stress might be able to show this happening.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we just coordinate on newlines?
When we get within some tolerance of the max file size, mark as pending rotation, at the next incoming data split on newline, keep the rest of the buffer, rotate, write to new file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's continue in the issue: #131788 (comment)

// Get the file extension
ext := filepath.Ext(w.filePath)

// Remove the extension from the filename
pathWithoutExt := strings.TrimSuffix(w.filePath, ext)

rotateFilePath := pathWithoutExt + "-" + time.Now().Format(timeLayout) + ext

if w.filePath == rotateFilePath {
return nil
}

err := w.file.Close()
if err != nil {
return err
}

err = os.Rename(w.filePath, rotateFilePath)
if err != nil {
return err
}

w.file, err = os.OpenFile(w.filePath, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
if err != nil {
return err
}

w.currentSize = 0

if w.maxAge > 0 {
go func() {
err = w.clean(pathWithoutExt, ext)
}()
}

return nil
}

// Clean up the old log files in the format of
// <basename>-<timestamp><ext>.
// This should be safe enough to avoid false deletion.
// This will work for multiple restarts of the same program.
func (w *rotationFile) clean(pathWithoutExt string, ext string) error {
ageTime := time.Now().Add(-w.maxAge)

directory := filepath.Dir(pathWithoutExt)
basename := filepath.Base(pathWithoutExt) + "-"

dir, err := os.ReadDir(directory)
if err != nil {
return err
}

err = nil
for _, v := range dir {
if strings.HasPrefix(v.Name(), basename) && strings.HasSuffix(v.Name(), ext) {
// Remove the prefix and suffix
trimmed := strings.TrimPrefix(v.Name(), basename)
trimmed = strings.TrimSuffix(trimmed, ext)

_, err = time.Parse(timeLayout, trimmed)
if err == nil {
info, errInfo := v.Info()
if errInfo != nil {
err = errInfo
// Ignore the error while continue with the next clenup
continue
}

if ageTime.After(info.ModTime()) {
err = os.Remove(filepath.Join(directory, v.Name()))
if err != nil {
// Ignore the error while continue with the next clenup
continue
}
}
}

}
}

return err
}

func (w *rotationFile) Close() error {
w.mut.Lock()
defer w.mut.Unlock()

// Explicitly call file.Sync() to ensure data is written to disk
err := w.file.Sync()
if err != nil {
return err
}

err = w.file.Close()
if err != nil {
return err
}

return nil
}
Loading
Morty Proxy This is a proxified and sanitized view of the page, visit original site.