-
Notifications
You must be signed in to change notification settings - Fork 36
fix(linkerd-cni): prevent parent from outpacing forked monitor_cni_config()
process
#585
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…onfig()` process Signed-off-by: Simon Dickhoven <sdickhoven@everquote.com>
monitor_state=$( | ||
(ps --ppid=$monitor_pid -o comm=,state= || true) | | ||
awk '$1 == "inotifywait" && $2 == "S" {print "ok"}' | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the monitor_state
variable will either be an empty string or it will contain (one or more) "ok".
...it should only be one "ok" because there is only one inotifywait
process that gets spawned by the monitor_cni_config()
function, but i wanted to make it very difficult for future changes to break this logic.
therefore, this logic will continue to work as expected even if multiple inotifywait
processes are spawned by monitor_cni_config()
.
the subsequent check that is responsible for breaking the infinite loop only cares about whether the monitor_state
variable is an empty string or not. so anything besides "ok" would work just as well.
i also experimented with only using ps
here:
$ ps --ppid=$monitor_pid -C inotifywait -o state=
however it appears as though ps
does not like multiple filters (1: --ppid=$monitor_pid
, 2: -C inotifywait
) and therefore ignores one of the two.
so i opted to have ps
emit all processes that are spawned by monitor_cni_config()
(which should be only inotifywait
) and then have awk
check if one of those processes is inotifywait
and whether that process is in the state S
(="interruptible sleep") which should indicate that it is not in the process of actively starting up but is instead waiting for filesystem events to come in.
https://www.man7.org/linux/man-pages/man1/ps.1.html#PROCESS_STATE_CODES
the =
after the output column name (comm=,state=
) tells ps
to omit the header for that column. this is not strictly necessary for the logic to work but we don't need the headers so i added it.
$ ps -o comm,state
COMMAND S
bash S
ps R
vs
$ ps -o comm=,state=
bash S
ps R
the || true
is there because if ps
doesn't find any processes (which is the whole reason for this logic to exist in the first place), it exits with a non-zero exit code (which would then abort the entire script... which we don't want).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could get much more elaborate here and use the filesystem for ipc between the main process and the forked monitor_cni_config()
process but that would require a lot more logic.
so i opted to keep things relatively straight-forward.
awk '$1 == "inotifywait" && $2 == "S" {print "ok"}' | ||
) | ||
[ -z "$monitor_state" ] || break | ||
sleep .1 # 100ms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given that i saw a ~140ms race in production, i figured that waiting in increments of 100ms is reasonable.
i tested the logic with the following script #!/usr/bin/env bash
set -e
#inotifywait -m /tmp &
myfunc() {
sleep 2
inotifywait -m .
}
myfunc &
monitor_pid=$!
while true; do
monitor_state=$(
(ps --ppid=$monitor_pid -o comm=,state= || true) |
awk '$1 == "inotifywait" && $2 == "S" {print "ok"}'
)
echo "tick $monitor_state"
[ -z "$monitor_state" ] || break
sleep .1 # 100ms
done
echo done
wait -n i also threw in a second it is also possible to see the new logic in action by inserting a |
Thanks for the very detailed overview. We should be able to take a deeper look at this next week. |
this pr addresses a race condition in which the parent process outpaces the forked
monitor_cni_config()
process.this race condition was observed in a production cluster and can clearly be seen in the log output:
as can be seen above, the
inotifywait
is about 140ms late to the party.the result is that existing cni config files are not patched which then causes the repair controller to incessantly "murder" pods that start up on the doomed worker node.
i should add that we've been running linkerd-cni v1.6.3 (which includes the fix for previously encountered race conditions) since 16 jul 2025 and v1.6.4 since 15 sep 2025 without a problem. yesterday was the first time we've seen this (new) race condition.