### What happened?
This issue summarizes some conversation towards the end of #7890, as requested by @thockin.
If an application in a privileged pod or container creates a FUSE mount in an emptyDir volume but fails to unmount it before terminating (either as a conscious choice by the application, or because kubernetes sent it a SIGKILL), the kubelet will fail to clean up the pod. A recurring error will appear in the kubelet logs, and the pod will remain in the API.
Here is an example error log from the kubelet during cleanup:

```
Jan 08 19:06:04 <hostname omitted> kubelet[12511]: E0108 19:06:04.507950 12511 nestedpendingoperations.go:348] Operation for "{volumeName:kubernetes.io/empty-dir/30b506e8-b18a-4d5c-bf7d-17fbae54a5d0-worker podName:30b506e8-b18a-4d5c-bf7d-17fbae54a5d0 nodeName:}" failed. No retries permitted until 2025-01-08 19:08:06.507933266 +0000 UTC m=+1437.970062341 (durationBeforeRetry 2m2s). Error: UnmountVolume.TearDown failed for volume "worker" (UniqueName: "kubernetes.io/empty-dir/30b506e8-b18a-4d5c-bf7d-17fbae54a5d0-worker") pod "30b506e8-b18a-4d5c-bf7d-17fbae54a5d0" (UID: "30b506e8-b18a-4d5c-bf7d-17fbae54a5d0") : openfdat /var/lib/kubelet/pods/30b506e8-b18a-4d5c-bf7d-17fbae54a5d0/volumes/kubernetes.io~empty-dir/worker/build: transport endpoint is not connected
```
The offending code seems to be here: https://github.com/kubernetes/kubernetes/blob/release-1.31/pkg/volume/emptydir/empty_dir.go#L490-L495
When cleaning up emptyDirs, the kubelet starts with `os.RemoveAll`; as it recurses through the directory, it eventually tries to inspect the contents of the FUSE mount, which results in an error.
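For illustration, the failure mode boils down to something like the sketch below (the path is a placeholder): `os.RemoveAll` tries to open and read the dead FUSE mountpoint, and the kernel returns `ENOTCONN`, which is the "openfdat ... transport endpoint is not connected" error in the log above.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

func main() {
	// Illustrative path: an emptyDir that still contains a dead FUSE mount.
	dir := "/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~empty-dir/worker"

	if err := os.RemoveAll(dir); err != nil {
		// RemoveAll's internal openat/readdir on the dead mountpoint fails
		// with ENOTCONN; the *PathError it returns wraps the raw errno.
		if errors.Is(err, syscall.ENOTCONN) {
			fmt.Println("dead FUSE mount blocked cleanup:", err)
			return
		}
		fmt.Println("cleanup failed:", err)
	}
}
```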
### What did you expect to happen?
The kubelet should eventually be able to clean up the pod.
### How can we reproduce it (as minimally and precisely as possible)?
- Run a privileged container that creates a FUSE mount within an emptyDir volume (see the sketch after this list).
- Configure the application not to unmount the FUSE mount on exit, or forcefully terminate the pod so the mount cannot be cleaned up.
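A minimal sketch of the first step, assuming the emptyDir is mounted at `/data` in a privileged container and the image vendors `github.com/hanwen/go-fuse/v2` (any FUSE implementation would do; paths are illustrative):

```go
package main

// Minimal FUSE "application" for reproducing the bug: it loopback-mounts
// /tmp/src at /data/mnt (inside the emptyDir) and deliberately never
// unmounts. SIGKILL the pod and the mountpoint is left dead.
import (
	"log"
	"os"

	"github.com/hanwen/go-fuse/v2/fs"
)

func main() {
	for _, d := range []string{"/tmp/src", "/data/mnt"} {
		if err := os.MkdirAll(d, 0o755); err != nil {
			log.Fatal(err)
		}
	}
	root, err := fs.NewLoopbackRoot("/tmp/src")
	if err != nil {
		log.Fatal(err)
	}
	server, err := fs.Mount("/data/mnt", root, &fs.Options{})
	if err != nil {
		log.Fatal(err)
	}
	// Intentionally no server.Unmount(): block until the kubelet kills us,
	// leaving the FUSE mount behind in the emptyDir.
	server.Wait()
}
```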
### Anything else we need to know?
I'd be happy to try to put a patch together to address this with a little guidance. It seems like we should be able to inspect for any mounts beneath the empty directory's `MetaDir` and `umount` them before we attempt to call `os.RemoveAll`.
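For concreteness, here is a rough sketch of that idea; the function name and placement are hypothetical, and it reads `/proc/self/mountinfo` directly, whereas an in-tree patch would presumably go through the existing `k8s.io/mount-utils` helpers instead:

```go
package emptydir // hypothetical placement; sketch only

import (
	"bufio"
	"os"
	"sort"
	"strings"

	"golang.org/x/sys/unix"
)

// unmountAllBelow detaches any mounts still present at or below dir so a
// subsequent os.RemoveAll cannot trip over a dead FUSE mountpoint.
func unmountAllBelow(dir string) error {
	f, err := os.Open("/proc/self/mountinfo")
	if err != nil {
		return err
	}
	defer f.Close()

	var stale []string
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		// The fifth whitespace-separated field of a mountinfo line is the
		// mount point (octal-escaped, which is fine for kubelet paths
		// since they contain no spaces).
		fields := strings.Fields(scanner.Text())
		if len(fields) < 5 {
			continue
		}
		if mp := fields[4]; mp == dir || strings.HasPrefix(mp, dir+"/") {
			stale = append(stale, mp)
		}
	}
	if err := scanner.Err(); err != nil {
		return err
	}

	// Unmount deepest paths first, and use MNT_DETACH so a dead FUSE
	// daemon cannot make the unmount itself hang.
	sort.Slice(stale, func(i, j int) bool { return len(stale[i]) > len(stale[j]) })
	for _, mp := range stale {
		if err := unix.Unmount(mp, unix.MNT_DETACH); err != nil {
			return err
		}
	}
	return nil
}
```

The kubelet would call something like this just before the existing `os.RemoveAll` in the emptyDir teardown path.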