Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

apiserver crashes on concurrent map iteration and write #120507

Copy link
Copy link
Closed
@natherz97

Description

@natherz97
Issue body actions

What happened?

During periods when there's webhook failures, it is possible for the apiserver to crash while generating an audit log entry for the given request. Specifically, when a large number of pods are deleted that are intercepted by a mutating webhook that is timing out, the following fatal error can be encountered:

E0905 18:47:26.898105      11 dispatcher.go:184] failed calling webhook "webhook.com": failed to call webhook: Post "https://webhook:443/mutate?timeout=12s": context deadline exceeded
fatal error: concurrent map iteration and map write

goroutine 295461 [running]:
reflect.mapiterinit(0x451512?, 0x200?, 0x48da4c0?)
    /usr/local/go/src/runtime/map.go:1375 +0x19
reflect.(*MapIter).Next(0x4528ac0?)
    /usr/local/go/src/reflect/value.go:1919 +0x65
encoding/json.mapEncoder.encode({0xc05cea2cf8?}, 0xc06363ae00, {0x4528ac0?, 0xc0b2bdf170?, 0xc0b2bdf0b0?}, {0xe?, 0x0?})
    /usr/local/go/src/encoding/json/encode.go:797 +0x33e
encoding/json.structEncoder.encode({{{0xc0208ea000?, 0x3dfdbc4?, 0x5?}, 0xc041e42b40?}}, 0xc06363ae00, {0x4e6fa60?, 0xc0b2bdf040?, 0x3dfdbc4?}, {0x0, 0x1})
    /usr/local/go/src/encoding/json/encode.go:759 +0x1f4
encoding/json.ptrEncoder.encode({0x0?}, 0xc06363ae00, {0x4e16cc0?, 0xc0b2bdf040?, 0x4e16cc0?}, {0xb6?, 0x68?})
    /usr/local/go/src/encoding/json/encode.go:943 +0x21c
encoding/json.(*encodeState).reflectValue(0xc05cea2fb0?, {0x4e16cc0?, 0xc0b2bdf040?, 0xc0ab3181a0?}, {0x60?, 0x2c?})
    /usr/local/go/src/encoding/json/encode.go:358 +0x78
encoding/json.(*encodeState).marshal(0x58d8a20?, {0x4e16cc0?, 0xc0b2bdf040?}, {0xb9?, 0xe?})
    /usr/local/go/src/encoding/json/encode.go:330 +0xfa
encoding/json.(*Encoder).Encode(0xc05cea30d0, {0x4e16cc0, 0xc0b2bdf040})
    /usr/local/go/src/encoding/json/stream.go:209 +0xf3
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/serializer/json.(*Serializer).doEncode(0x9cc5cf?, {0x58e09e0?, 0xc0b2bdf040?}, {0x58c99e0, 0xc0b2c2fc20})
    vendor/k8s.io/apimachinery/pkg/runtime/serializer/json/json.go:246 +0x19a
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/serializer/json.(*Serializer).Encode(0xc000344c30, {0x58e09e0, 0xc0b2bdf040}, {0x58c99e0, 0xc0b2c2fc20})
    vendor/k8s.io/apimachinery/pkg/runtime/serializer/json/json.go:220 +0xfc
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/serializer/versioning.(*codec).doEncode(0xc00093dc20, {0x58e0940, 0xc04f459a40}, {0x58c99e0, 0xc0b2c2fc20}, {0x0?, 0x0?})
    vendor/k8s.io/apimachinery/pkg/runtime/serializer/versioning/versioning.go:284 +0xa12
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/serializer/versioning.(*codec).encode(0xc00093dc20, {0x58e0940, 0xc04f459a40}, {0x58c99e0, 0xc0b2c2fc20}, {0x0?, 0x0?})
    vendor/k8s.io/apimachinery/pkg/runtime/serializer/versioning/versioning.go:214 +0x167
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/serializer/versioning.(*codec).Encode(0xc05cb67d10?, {0x58e0940?, 0xc04f459a40?}, {0x58c99e0?, 0xc0b2c2fc20?})
    vendor/k8s.io/apimachinery/pkg/runtime/serializer/versioning/versioning.go:207 +0x33
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime.Encode({0x7f407ed689c8, 0xc00093dc20}, {0x58e0940, 0xc04f459a40})
    vendor/k8s.io/apimachinery/pkg/runtime/codec.go:50 +0x64
k8s.io/kubernetes/vendor/k8s.io/apiserver/plugin/pkg/audit/log.(*backend).logEvent(0xc000676750, 0xc04f459a40)
    vendor/k8s.io/apiserver/plugin/pkg/audit/log/backend.go:76 +0x73
k8s.io/kubernetes/vendor/k8s.io/apiserver/plugin/pkg/audit/log.(*backend).ProcessEvents(0x40e167?, {0xc09db7fcf8, 0x1, 0xc0b2bc6d01?})
    vendor/k8s.io/apiserver/plugin/pkg/audit/log/backend.go:65 +0x45
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/options.(*ignoreErrorsBackend).ProcessEvents(0x59013c0?, {0xc09db7fcf8?, 0x40e167?, 0x98?})
    vendor/k8s.io/apiserver/pkg/server/options/audit.go:395 +0x26
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.processAuditEvent({0x59013c0, 0xc05cb67d10}, {0x7f407ed6fd98, 0xc0012a1f40}, 0xc04f459a40, {0xc0012a1ed0?, 0x1, 0x8337e00?})
    vendor/k8s.io/apiserver/pkg/endpoints/filters/audit.go:202 +0x24f
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithAudit.func1.1()
    vendor/k8s.io/apiserver/pkg/endpoints/filters/audit.go:115 +0x41d
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithAudit.func1({0x5900650?, 0xc0ab318160}, 0xc098e8e800)
    vendor/k8s.io/apiserver/pkg/endpoints/filters/audit.go:118 +0x5c5
net/http.HandlerFunc.ServeHTTP(0xa?, {0x5900650?, 0xc0ab318160?}, 0xc0612f1a80?)
    /usr/local/go/src/net/http/server.go:2122 +0x2f
goroutine 295461 [running]:

This cluster is running v1.24.16. The panic is occurring on this line while generating the audit log entry: https://github.com/kubernetes/kubernetes/blob/v1.24.16/staging/src/k8s.io/apimachinery/pkg/runtime/serializer/json/json.go#L246

What did you expect to happen?

A panic should not be possible from the audit handler.

How can we reproduce it (as minimally and precisely as possible)?

We do not have a method to reproduce this deterministically. This has occurred during periods where 1000+ pods are deleted which are intercepted by a mutating webhook that is timing out.

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.16-eks-2d98532", GitCommit:"af930c12e26ef9d1e8fac7e3532ff4bcc1b2b509", GitTreeState:"clean", BuildDate:"2023-07-28T16:56:16Z", GoVersion:"go1.20.6", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.16-eks-2d98532", GitCommit:"af930c12e26ef9d1e8fac7e3532ff4bcc1b2b509", GitTreeState:"clean", BuildDate:"2023-07-28T16:52:47Z", GoVersion:"go1.20.6", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

AWS

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

Metadata

Metadata

Assignees

Labels

kind/bugCategorizes issue or PR as related to a bug.Categorizes issue or PR as related to a bug.sig/api-machineryCategorizes an issue or PR as relevant to SIG API Machinery.Categorizes an issue or PR as relevant to SIG API Machinery.sig/authCategorizes an issue or PR as relevant to SIG Auth.Categorizes an issue or PR as relevant to SIG Auth.triage/acceptedIndicates an issue or PR is ready to be actively worked on.Indicates an issue or PR is ready to be actively worked on.

Type

No type

Projects

Status

Closed / Done
Show more project fields

Status

Done
Show more project fields

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions

    Morty Proxy This is a proxified and sanitized view of the page, visit original site.