fsmonitor: implement filesystem change listener for Linux#2147
fsmonitor: implement filesystem change listener for Linux#2147ptarjan wants to merge 10 commits intogit:mastergit/git:masterfrom ptarjan:claude/update-pr-1352-current-85Gk8ptarjan/git:claude/update-pr-1352-current-85Gk8Copy head branch name to clipboard
Conversation
Welcome to GitGitGadgetHi @ptarjan, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests. Please make sure that either:
You can CC potential reviewers by adding a footer to the PR description with the following syntax: NOTE: DO NOT copy/paste your CC list from a previous GGG PR's description, Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:
It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code. Contributing the patchesBefore you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form Both the person who commented An alternative is the channel Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment If you want to see what email(s) would be sent for a After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail). If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the curl -g --user "<EMailAddress>:<Password>" \
--url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txtTo iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description): To send a new iteration, just add another PR comment with the contents: Need help?New contributors who want advice are encouraged to join git-mentoring@googlegroups.com, where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join. You may also be able to find help in real time in the developer IRC channel, |
|
/submit |
|
Error: User ptarjan is not yet permitted to use GitGitGadget |
0fab68b to
e079635
Compare
|
/allow |
|
User ptarjan is now allowed to use GitGitGadget. |
5f45d96 to
aa60645
Compare
|
/submit |
|
Submitted as pull.2147.git.git.1767082450088.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
|
On the Git mailing list, Junio C Hamano wrote (reply to this): Exciting.
It seems to die with leaks when "make SANITIZE=leak test" is run,
though.
Initialized empty Git repository in /home/gitster/w/git.git/t/trash directory.t7527-builtin-fsmonitor/test_implicit/.git/
fsmonitor-daemon is not watching '/home/gitster/w/git.git/t/trash directory.t7527-builtin-fsmonitor/test_implicit'
builtin:0.145521.20251230T113644.793433Z:0Q/Q
{"event":"data","sid":"20251230T113644.762195Z-H3cfff1b1-P0002386f","thread":"main","time":"2025-12-30T11:36:44.813131Z","file":"fsmonitor-ipc.c","line":99,"t_abs":0.052581,"t_rel":0.048830,"nesting":2,"category":"fsm_client","key":"query/response-length","value":"45"}
fsmonitor-daemon is watching '/home/gitster/w/git.git/t/trash directory.t7527-builtin-fsmonitor/test_implicit'
fsmonitor-daemon is not watching '/home/gitster/w/git.git/t/trash directory.t7527-builtin-fsmonitor/test_implicit'
fatal: fsmonitor--daemon is not running
not ok 2 - implicit daemon start
#
# test_when_finished "stop_daemon_delete_repo test_implicit" &&
#
# git init test_implicit &&
# test_must_fail git -C test_implicit fsmonitor--daemon status &&
#
# # query will implicitly start the daemon.
# #
# # for test-script simplicity, we send a V1 timestamp rather than
# # a V2 token. either way, the daemon response to any query contains
# # a new V2 token. (the daemon may complain that we sent a V1 request,
# # but this test case is only concerned with whether the daemon was
# # implicitly started.)
#
# GIT_TRACE2_EVENT="$PWD/.git/trace" \
# test-tool -C test_implicit fsmonitor-client query --token 0 >actual &&
# nul_to_q <actual >actual.filtered &&
# grep "builtin:" actual.filtered &&
#
# # confirm that a daemon was started in the background.
# #
# # since the mechanism for starting the background daemon is platform
# # dependent, just confirm that the foreground command received a
# # response from the daemon.
#
# have_t2_data_event fsm_client query/response-length <.git/trace &&
#
# git -C test_implicit fsmonitor--daemon status &&
# git -C test_implicit fsmonitor--daemon stop &&
# test_must_fail git -C test_implicit fsmonitor--daemon status
#
1..2
=================================================================
==git==145489==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 512 byte(s) in 1 object(s) allocated from:
#0 0x56179e6ce042 in calloc (git+0x8c042) (BuildId: de5ce3c9d0b0c09380c910e6a9eb181e324abde6)
#1 0x56179ea0aef4 in xcalloc wrapper.c:154:8
#2 0x56179e8b17b7 in alloc_table hashmap.c:79:2
#3 0x56179e8b174c in hashmap_init hashmap.c:168:2
#4 0x56179e73e6fe in fsmonitor_run_daemon builtin/fsmonitor--daemon.c:1288:2
#5 0x56179e73e141 in try_to_run_foreground_daemon builtin/fsmonitor--daemon.c:1448:11
#6 0x56179e73dc44 in cmd_fsmonitor__daemon builtin/fsmonitor--daemon.c:1584:12
#7 0x56179e6d2c8a in run_builtin git.c:506:11
#8 0x56179e6d1910 in handle_builtin git.c:779:9
#9 0x56179e6d2747 in run_argv git.c:862:4
#10 0x56179e6d169b in cmd_main git.c:984:19
#11 0x56179e7f7a7a in main common-main.c:9:11
#12 0x7f091ea66ca7 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#13 0x7f091ea66d64 in __libc_start_main csu/../csu/libc-start.c:360:3
#14 0x56179e69e280 in _start (git+0x5c280) (BuildId: de5ce3c9d0b0c09380c910e6a9eb181e324abde6) |
aa60645 to
7a7fe25
Compare
|
/submit |
|
Submitted as pull.2147.v2.git.git.1767096494372.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
7a7fe25 to
207da68
Compare
|
/submit |
|
Submitted as pull.2147.v3.git.git.1767099302592.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
|
On the Git mailing list, Junio C Hamano wrote (reply to this): Thanks for a quick turnaround, but it would be more efficient if you
hunted all the leaks yourself, instead of getting a report for one
issue and updating the patch to fix that one issue.
Here is what I am getting these:
$ make SANITIZE=address CC=clang && cd t && sh t7527-*.sh -i -v
Note that "-i" is to say "stop at the first one".
expecting success of 7527.12 'create some files':
test_when_finished clean_up_repo_and_stop_daemon &&
start_daemon --tf "$PWD/.git/trace" &&
create_files &&
test-tool fsmonitor-client query --token 0 &&
grep "^event: dir1/new$" .git/trace &&
grep "^event: dir2/new$" .git/trace &&
grep "^event: new$" .git/trace
fsmonitor-daemon is watching '/home/gitster/w/git.git/t/trash directory.t7527-builtin-fsmonitor'
builtin:0.1039108.20251230T123036.129805Z:0/event: dir1/new
event: dir1/new
event: dir2/new
event: dir2/new
event: new
event: new
HEAD is now at 1d1edcb initial
Removing dir1/new
Removing dir2/new
Removing new
not ok 12 - create some files
#
# test_when_finished clean_up_repo_and_stop_daemon &&
#
# start_daemon --tf "$PWD/.git/trace" &&
#
# create_files &&
#
# test-tool fsmonitor-client query --token 0 &&
#
# grep "^event: dir1/new$" .git/trace &&
# grep "^event: dir2/new$" .git/trace &&
# grep "^event: new$" .git/trace
#
1..12
=================================================================
==git==1039073==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x55c18d8d4042 in calloc (git+0x8c042) (BuildId: 4097db008a82663ae0b3398128a7ab4e09bbdd21)
#1 0x55c18dc10f14 in xcalloc wrapper.c:154:8
#2 0x55c18d945f72 in kh_init_str builtin/fsmonitor--daemon.c:656:1
#3 0x55c18d945828 in do_handle_client builtin/fsmonitor--daemon.c:871:10
#4 0x55c18d945191 in handle_client builtin/fsmonitor--daemon.c:987:11
#5 0x55c18dc283e2 in worker_thread__do_io compat/simple-ipc/ipc-unix-socket.c:532:9
#6 0x55c18dc27a7f in worker_thread_proc compat/simple-ipc/ipc-unix-socket.c:606:9
#7 0x55c18d8d64f4 in void* ThreadStartFunc<false>(void*) lsan_interceptors.cpp.o
#8 0x7fe358257b7a in start_thread nptl/pthread_create.c:448:8
#9 0x7fe3582d57b7 in __GI___clone3 misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:78 |
|
Sorry for the noise. Is your run with my v3 code? That one had a green CI
run. There was an existing leak that I sent as another patch as well as
bundled into this as v3 as I’m not sure of your procedure for stacked
patches.
…On Tue, Dec 30, 2025 at 5:41 AM gitgitgadget-git[bot] < ***@***.***> wrote:
*gitgitgadget-git[bot]* left a comment (git/git#2147)
<#2147 (comment)>
On the Git mailing list
***@***.***>, Junio C Hamano
wrote (reply to this <https://gitgitgadget.github.io/reply-to-this>):
Thanks for a quick turnaround, but it would be more efficient if you
hunted all the leaks yourself, instead of getting a report for one
issue and updating the patch to fix that one issue.
Here is what I am getting these:
$ make SANITIZE=address CC=clang && cd t && sh t7527-*.sh -i -v
Note that "-i" is to say "stop at the first one".
expecting success of 7527.12 'create some files':
test_when_finished clean_up_repo_and_stop_daemon &&
start_daemon --tf "$PWD/.git/trace" &&
create_files &&
test-tool fsmonitor-client query --token 0 &&
grep "^event: dir1/new$" .git/trace &&
grep "^event: dir2/new$" .git/trace &&
grep "^event: new$" .git/trace
fsmonitor-daemon is watching '/home/gitster/w/git.git/t/trash directory.t7527-builtin-fsmonitor'builtin:0.1039108.20251230T123036.129805Z:0/event: dir1/newevent: dir1/newevent: dir2/newevent: dir2/newevent: newevent: new
HEAD is now at 1d1edcb initial
Removing dir1/new
Removing dir2/new
Removing new
not ok 12 - create some files
#
# test_when_finished clean_up_repo_and_stop_daemon &&
#
# start_daemon --tf "$PWD/.git/trace" &&
#
# create_files &&
#
# test-tool fsmonitor-client query --token 0 &&
#
# grep "^event: dir1/new$" .git/trace &&
# grep "^event: dir2/new$" .git/trace &&
# grep "^event: new$" .git/trace
#
1..12
=================================================================
==git==1039073==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x55c18d8d4042 in calloc (git+0x8c042) (BuildId: 4097db008a82663ae0b3398128a7ab4e09bbdd21)
#1 0x55c18dc10f14 in xcalloc wrapper.c:154:8
#2 0x55c18d945f72 in kh_init_str builtin/fsmonitor--daemon.c:656:1
#3 0x55c18d945828 in do_handle_client builtin/fsmonitor--daemon.c:871:10
#4 0x55c18d945191 in handle_client builtin/fsmonitor--daemon.c:987:11
#5 0x55c18dc283e2 in worker_thread__do_io compat/simple-ipc/ipc-unix-socket.c:532:9
#6 0x55c18dc27a7f in worker_thread_proc compat/simple-ipc/ipc-unix-socket.c:606:9
#7 0x55c18d8d64f4 in void* ThreadStartFunc<false>(void*) lsan_interceptors.cpp.o
#8 0x7fe358257b7a in start_thread nptl/pthread_create.c:448:8
#9 0x7fe3582d57b7 in __GI___clone3 misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
—
Reply to this email directly, view it on GitHub
<#2147 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAJZT5PP5JDZU2EFF27U4T4EKMJ7AVCNFSM6AAAAACQJJ6JQ6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZTMOJZG42DIOJXGA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Two patches with suggested improvements: 1. Documentation and minor fixes: - Document why V9FS is excluded from is_remote_fs() (supports inotify) - Replace magic number 1000 with STALE_RENAME_CHECK_INTERVAL_SEC constant - Document buffer size and timeout rationale 2. Linux-specific tests: - Test fsmonitor.socketdir configuration - Test directory rename tracking via inotify cookies - Test cross-directory moves - Test rapid nested directory creation
207da68 to
c7fd346
Compare
|
/submit |
|
Submitted as pull.2147.v4.git.git.1767202894884.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
c7fd346 to
350c90b
Compare
|
On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Wed, Dec 31, 2025 at 05:41:34PM +0000, Paul Tarjan via GitGitGadget wrote:
> From: Paul Tarjan <github@paulisageek.com>
>
> Implement fsmonitor for Linux using the inotify API, bringing it to
> feature parity with existing Windows and macOS implementations.
>
> The Linux implementation uses inotify to monitor filesystem events.
> Unlike macOS's FSEvents which can watch a single root directory,
> inotify requires registering watches on every directory of interest.
> The implementation carefully handles directory renames and moves
> using inotify's cookie mechanism to track IN_MOVED_FROM/IN_MOVED_TO
> event pairs.
>
> Key implementation details:
> - Uses inotify_init1(O_NONBLOCK) for non-blocking event monitoring
> - Maintains bidirectional hashmaps between watch descriptors and paths
> for efficient event processing
> - Handles directory creation, deletion, and renames dynamically
> - Detects remote filesystems (NFS, CIFS, SMB, etc.) via statfs()
> - Falls back to $HOME/.git-fsmonitor-* for socket when .git is remote
> - Creates batches lazily (only for actual file events, not cookies)
> to avoid spurious sequence number increments
>
> Build configuration:
> - Enabled via FSMONITOR_DAEMON_BACKEND=linux and FSMONITOR_OS_SETTINGS=linux
> - Requires NO_PTHREADS and NO_UNIX_SOCKETS to be unset
> - Adds HAVE_LINUX_MAGIC_H for filesystem type detection
This would also need the below patch to support Meson. Would be great if
you include it, otherwise I can send it as a separate patch once this
topic lands. Thanks!
Patrick
-- >8 --
diff --git a/meson.build b/meson.build
index dd52efd1c8..0130d40702 100644
--- a/meson.build
+++ b/meson.build
@@ -1322,6 +1322,9 @@ endif
fsmonitor_backend = ''
if host_machine.system() == 'windows'
fsmonitor_backend = 'win32'
+elif host_machine.system() == 'linux' and threads.found() and compiler.has_header('linux/magic.h')
+ fsmonitor_backend = 'linux'
+ libgit_c_args += '-DHAVE_LINUX_MAGIC_H'
elif host_machine.system() == 'darwin'
fsmonitor_backend = 'darwin'
libgit_dependencies += dependency('CoreServices') |
|
User |
ffcf5e6 to
41a1114
Compare
Add a pthread_cond_timedwait() implementation to the Windows pthread compatibility layer using SleepConditionVariableCS() with a millisecond timeout computed from the absolute deadline. This enables callers to use bounded waits on condition variables instead of blocking indefinitely with pthread_cond_wait(). Signed-off-by: Paul Tarjan <github@paulisageek.com>
6873623 to
bb438af
Compare
|
/submit |
|
On Tue, Feb 24, 2026 at 2:03 AM gitgitgadget-git[bot] < ***@***.***> wrote:
*gitgitgadget-git[bot]* left a comment (git/git#2147)
<#2147 (comment)>
*Patrick Steinhardt* wrote on the Git mailing list
***@***.***> (how to reply to
this email <https://gitgitgadget.github.io/reply-to-this>):
Thanks for the thorough review. v6 is a 10-patch series now.
I think other than stating that this uses inotify, what this commit
message should also explain is why it was chosen over its alternatives
like fanotify.
Yeah, good call. fanotify needs CAP_SYS_ADMIN or CAP_PERFMON so it's a
non-starter for an unprivileged daemon. Added that.
It would be great to avoid all these bulleted lists here and instead
have some paragraphs that explain the design decisions.
Rewrote.
The issues that you've addressed don't typically belong into the commit
message. They are useful context when part of a cover letter, but are
less useful in the commit messages themselves.
Moved to cover letter.
In that case it might make sense to say "Based-on-patch-by:" for both
of these authors.
Done.
These feel like while-at-it memory leak fixes. They should probably be
moved into separate commits with a proper explanation.
Split out as [1/10] and [2/10].
This is (almost) the exact same implementation as we have on macOS.
We should probably deduplicate the logic.
Yep. [5/10] merges them into fsm-ipc-unix.c. The unified version picks up
the worktree NULL BUG() guard that was only in the Linux version.
This file is again an almost exact copy of what we have in
"fsm-settings-darwin.c", and as far as I can see there isn't even
anything specific to either of the systems here. So we should probably
deduplicate the logic.
[6/10] merges them into fsm-settings-unix.c. Kept the xstrdup()+dirname()
approach from macOS and the "vfat" check.
From these we should really only require <sys/inotify.h>, as all
others are included via "compat/posix.h"
Removed. Also dropped the redundant <errno.h>/<stdio.h>/<string.h> from
fsm-path-utils-linux.c while I was at it.
Error messages should start with a lower-case letter.
Fixed.
Is it intended that the polling timeout is 1000 seconds, or ~16
minutes? If so, it feels like something that might warrant a comment.
Nope, just a bug. Both upstream PRs had 1000 too so I inherited it without
looking closely. The stale rename check is now 1 second. I also bumped the
poll() timeout from 1ms to 50ms to reduce idle wake-ups.
Hm. In case the watch already exists, can't it be the case that we
have raced e.g. during a rename? For example, the old watch hasn't
been removed yet, but we already try to create the new watch. If we
then see EEXIST we wouldn't update the hash map and thus keep the old
path intact.
Or is the order of inotify events guaranteed?
Renames can't hit EEXIST because rename_dir() does remove_watch() before
add_watch(), so the old watch is gone by the time we add the new one.
Where EEXIST actually comes up is the recursive scan overlap. Say we get
IN_CREATE for a new directory: we add_watch() it and then
register_inotify() recursively scans its subdirectories. Meanwhile the
kernel has queued IN_CREATE events for those same subdirs. When we get back
to processing those queued events, we try add_watch() on directories that
register_inotify() already took care of. With IN_MASK_CREATE the duplicate
gets EEXIST, which is fine since the watch is already set up correctly.
Is it correct to conflate ENOENT and ENOTDIR here? In the first case
the directory was deleted, sure. But in the second case the directory
might have turned into a file due to renames, so don't we have to
treat it a bit differently?
Either way it's not a directory we can watch, and the parent's inotify
watch already delivered the relevant events (IN_MOVED_FROM/IN_MOVED_TO or
IN_DELETE/IN_CREATE). We're just doing a recursive scan here, so if the
path is gone or turned into a file, there's nothing left to scan.
We don't use fstatat(3p) in our codebase yet, also because it's not
easily portable to Windows. But I wonder whether we should use it here
to be safer against races.
Yeah it'd be marginally safer, but since it's not used elsewhere and the
parent watch catches any replacement anyway, I'd rather not introduce it
here. Happy to revisit if it becomes a real problem.
Curious. Don't we have to unregister the watcher in case a directory
was deleted?
We do, just not in that code path. The kernel sends IN_DELETE to the parent
watch (that's what we see in process_event) and separately sends
IN_DELETE_SELF to the directory's own watch. handle_events() catches
IN_DELETE_SELF via em_remove_watch() → remove_watch(). So the break in
process_event just records the path change; the actual cleanup happens when
IN_DELETE_SELF arrives.
Can it happen that events arrive out-of-order so that we have some
events queued up that would touch the same path? In such a case we
might silently ignore such queued events.
Events for a given wd come in order within a read() buffer. If we remove a
watch and then see stale events for that wd in a later read(), the hashmap
lookup returns NULL and we skip them. That's fine since the path change was
already reported when we processed IN_DELETE_SELF.
Do we need to use a monotonic clock here to ensure that there cannot
be any backwards jumps in time, e.g. when switching from summer to
winter time?
time(NULL) is UTC so DST doesn't matter. An NTP step could cause a small
jump but the worst case is the stale rename check fires a bit early or
late, which is harmless. The rest of fsmonitor uses time(NULL) for batch
pinning too so this stays consistent.
Do we have to synchronize access to `state->listen_data->shutdown`?
As far as I can see it's being set by the fs event listener.
It's written by the listener thread and checked in that same thread's poll
loop. The only other writer is fsm_listen__stop_async() from the main
thread, and it only ever transitions from CONTINUE → terminal. Worst case
of a torn read is one extra poll iteration. The macOS and Windows backends
do the same.
In which cases do we need to have this fallback for statfs? This
syscall exists in Linux since the 90s, so shouldn't we be able to
assume that we can use it? Or are there specific error cases that we
need to worry about here?
It's not a fallback for statfs. statfs() always works and gives us the fs
magic number for the is_remote check. find_mount() reads /proc/mounts to
get a human-readable fstype string ("nfs4" instead of 0x6969) for trace
logging. If /proc/mounts isn't available we just fall back to a
magic-number-to-name lookup table.
It's a bit unclear why specifically this test was added. Does it catch
an edge case that you have discovered? Might make sense to also add it
in a preparatory commit so that we can get a bit of context.
Yeah, this caught an actual crash in v3 (the EEXIST race I described
above). Moved to its own commit [8/10] with the context in the message.
Four new patches in this version:
[3/10] and [4/10]: pthread_cond_timedwait. [3/10] adds
pthread_cond_timedwait to the Windows pthread compat layer, [4/10] uses it
to replace the infinite pthread_cond_wait() in the cookie-wait path with a
1-second timeout. On overlay filesystems (common in containers) inotify
watches succeed but events never arrive, so without this the daemon blocks
forever waiting for the cookie event.
[9/10] and [10/10]: fd cleanup and daemon detach. [9/10] adds a
close_fd_above_stderr option to child_process that closes fds 3+ after fork
(uses sysconf(_SC_OPEN_MAX) capped at 4096). [10/10] uses it in
spawn_daemon() and also calls setsid() on --detach so shells with job
control don't wait on the daemon's process group. This is what was causing
the CI hang on Fedora. stop_git() also had a bug where `ps` wasn't
available (no procps-ng), leaving git_pgid empty, and bash in sh mode
silently accepts `kill -0 -- -` so the loop ran forever.
…
Message ID: ***@***.***>
|
|
Submitted as pull.2147.v6.git.git.1772050636.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
| @@ -671,7 +671,7 @@ static int do_handle_client(struct fsmonitor_daemon_state *state, | ||
| const struct fsmonitor_batch *batch; |
There was a problem hiding this comment.
Junio C Hamano wrote on the Git mailing list (how to reply to this email):
"Paul Tarjan via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Paul Tarjan <github@paulisageek.com>
>
> The `shown` kh_str_t was freed with kh_release_str() at a point in
> the code only reachable in the non-trivial response path. When the
> client receives a trivial response, the code jumps to the `cleanup`
> label, skipping the kh_release_str() call entirely and leaking the
> hash table.
>
> Fix this by initializing `shown` to NULL and moving the cleanup to the
> `cleanup` label using kh_destroy_str(), which is safe to call on NULL.
> This ensures the hash table is freed regardless of which code path is
> taken.
Makes perfect sense, the changes to the code matches the
description, and the difference between kh_release_* and
kh_destroy_* in khash.h is exactly as described in the log message.
Perfect. I wish all the patches posted here are as easy to review
as this one ;-)
> Signed-off-by: Paul Tarjan <github@paulisageek.com>
> ---
> builtin/fsmonitor--daemon.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
> index 242c594646..bc4571938c 100644
> --- a/builtin/fsmonitor--daemon.c
> +++ b/builtin/fsmonitor--daemon.c
> @@ -671,7 +671,7 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
> const struct fsmonitor_batch *batch;
> struct fsmonitor_batch *remainder = NULL;
> intmax_t count = 0, duplicates = 0;
> - kh_str_t *shown;
> + kh_str_t *shown = NULL;
> int hash_ret;
> int do_trivial = 0;
> int do_flush = 0;
> @@ -909,8 +909,6 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
> total_response_len += payload.len;
> }
>
> - kh_release_str(shown);
> -
> pthread_mutex_lock(&state->main_lock);
>
> if (token_data->client_ref_count > 0)
> @@ -954,6 +952,7 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
> trace2_data_intmax("fsmonitor", the_repository, "response/count/duplicates", duplicates);
>
> cleanup:
> + kh_destroy_str(shown);
> strbuf_release(&response_token);
> strbuf_release(&requested_token_id);
> strbuf_release(&payload);| @@ -197,20 +197,31 @@ static enum fsmonitor_cookie_item_result with_lock__wait_for_cookie( | ||
| unlink(cookie_pathname.buf); |
There was a problem hiding this comment.
Junio C Hamano wrote on the Git mailing list (how to reply to this email):
"Paul Tarjan via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Paul Tarjan <github@paulisageek.com>
>
> The cookie wait in with_lock__wait_for_cookie() uses an infinite
> pthread_cond_wait() loop. The existing comment notes the desire
> to switch to pthread_cond_timedwait(), but the routine was not
> available in git thread-utils.
>
> On certain container or overlay filesystems, inotify watches may
> succeed but events are never delivered. In this case the daemon
> would hang indefinitely waiting for the cookie event, which in
> turn causes the client to hang.
>
> Replace the infinite wait with a one-second timeout using
> pthread_cond_timedwait(). If the timeout fires, report an
> error and let the client proceed with a trivial (full-scan)
> response rather than blocking forever.
I cannot convince myself if one-second interval is not too frequent
to force everybody, including those with working inotify, to poll.
I wonder if this is something that may want to be configurable (or
better yet, auto-detectable, but that may be wishing for moon).
> Signed-off-by: Paul Tarjan <github@paulisageek.com>
> ---
> builtin/fsmonitor--daemon.c | 37 ++++++++++++++++++++++++-------------
> 1 file changed, 24 insertions(+), 13 deletions(-)
>
> diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
> index 4d52622e24..f6c406ff12 100644
> --- a/builtin/fsmonitor--daemon.c
> +++ b/builtin/fsmonitor--daemon.c
> @@ -197,20 +197,31 @@ static enum fsmonitor_cookie_item_result with_lock__wait_for_cookie(
> unlink(cookie_pathname.buf);
>
> /*
> - * Technically, this is an infinite wait (well, unless another
> - * thread sends us an abort). I'd like to change this to
> - * use `pthread_cond_timedwait()` and return an error/timeout
> - * and let the caller do the trivial response thing, but we
> - * don't have that routine in our thread-utils.
> - *
> - * After extensive beta testing I'm not really worried about
> - * this. Also note that the above open() and unlink() calls
> - * will cause at least two FS events on that path, so the odds
> - * of getting stuck are pretty slim.
> + * Wait for the listener thread to see the cookie file.
> + * Time out after a short interval so that the client
> + * does not hang forever if the filesystem does not deliver
> + * events (e.g., on certain container/overlay filesystems
> + * where inotify watches succeed but events never arrive).
> */
> - while (cookie->result == FCIR_INIT)
> - pthread_cond_wait(&state->cookies_cond,
> - &state->main_lock);
> + {
> + struct timeval now;
> + struct timespec ts;
> + int err = 0;
> +
> + gettimeofday(&now, NULL);
> + ts.tv_sec = now.tv_sec + 1;
> + ts.tv_nsec = now.tv_usec * 1000;
> +
> + while (cookie->result == FCIR_INIT && !err)
> + err = pthread_cond_timedwait(&state->cookies_cond,
> + &state->main_lock,
> + &ts);
> + if (err == ETIMEDOUT && cookie->result == FCIR_INIT) {
> + trace_printf_key(&trace_fsmonitor,
> + "cookie_wait timed out");
> + cookie->result = FCIR_ERROR;
> + }
> + }
>
> done:
> hashmap_remove(&state->cookies, &cookie->entry, NULL);There was a problem hiding this comment.
The 1-second timeout only fires when the filesystem fails to deliver the cookie event at all (e.g. overlayfs in containers where inotify watches succeed but events never arrive). On a working filesystem the cookie event comes back in well under a millisecond, so the timeout never triggers. When it does fire, the client falls back to a full scan, which is the safe default. Happy to make it configurable if you think that's worth it, but the current behavior seemed reasonable as a starting point.
There was a problem hiding this comment.
@ptarjan please note that gitgitgadget/gitgitgadget#154 is not implemented, and therefore all of your replies here in the PR are not reaching their intended audience. You will need to follow the guidance in the "reply to this" links.
There was a problem hiding this comment.
Thanks for the heads up. I was mostly doing this for self tracking to make sure I addressed them all. Is the changelog in the PR body enough of a reply or should I forward all these notes to the list by hand?
There was a problem hiding this comment.
Is the changelog in the PR body enough of a reply or should I forward all these notes to the list by hand?
It's up to you. In the past I have skipped the personal replies when under enormous time pressure. I feel that this did not go over well, even if I did not have much choice on the matter.
There was a problem hiding this comment.
Paul Tarjan wrote on the Git mailing list (how to reply to this email):
Junio C Hamano <gitster@pobox.com> writes:
> I cannot convince myself if one-second interval is not too frequent
> to force everybody, including those with working inotify, to poll.
> I wonder if this is something that may want to be configurable (or
> better yet, auto-detectable, but that may be wishing for moon).
The 1-second timeout only fires when the filesystem fails to deliver
the cookie event at all (e.g. overlayfs in containers where inotify
watches succeed but events never arrive). On a working filesystem
the cookie event comes back in well under a millisecond, so the
timeout never triggers. When it does fire, the client falls back to
a full scan, which is the safe default. Happy to make it
configurable if you think that's worth it, but the current behavior
seemed reasonable as a starting point.
Thanks,
Paul| @@ -197,20 +197,31 @@ static enum fsmonitor_cookie_item_result with_lock__wait_for_cookie( | ||
| unlink(cookie_pathname.buf); |
There was a problem hiding this comment.
Junio C Hamano wrote on the Git mailing list (how to reply to this email):
"Paul Tarjan via GitGitGadget" <gitgitgadget@gmail.com> writes:
> + * Wait for the listener thread to see the cookie file.
Is this a complete sentence? Or perhaps something like "see" ->
"check"?There was a problem hiding this comment.
Fixed, changed to "observe".
There was a problem hiding this comment.
Paul Tarjan wrote on the Git mailing list (how to reply to this email):
Junio C Hamano <gitster@pobox.com> writes:
> Is this a complete sentence? Or perhaps something like "see" ->
> "check"?
Fixed, changed to "observe".
Thanks,
Paul| @@ -2365,7 +2365,11 @@ ifdef FSMONITOR_DAEMON_BACKEND | ||
| COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND |
There was a problem hiding this comment.
Junio C Hamano wrote on the Git mailing list (how to reply to this email):
"Paul Tarjan via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Paul Tarjan <github@paulisageek.com>
>
> The IPC path logic for determining the Unix domain socket location is
> nearly identical between macOS and Linux. Both need to check whether
> the .git directory is on a remote filesystem and, if so, fall back to
> a socket path under $HOME or a user-configured directory.
>
> Merge the two implementations into a single fsm-ipc-unix.c that is
> shared by both platforms. The unified version includes the worktree
> NULL check (BUG guard) from the Linux implementation, which was missing
> in the macOS version.
>
> Update Makefile, meson.build, and CMakeLists.txt to use the new shared
> file for non-Windows platforms.
This sounds as if the patch started with two IPC path logic for
macOS and Linux, and the patch removes one of them and updates the
other one so that both platforms can use the surviving one.
But the patch seems to indicate somewhat different story. The code
before this patch started with a single macOS (darwin) one, but
because it is mostly applicable to other UNIX variants as well, the
patch renames the existing macOS one for unix and makes a small
adjustment (namely, asserts that r->worktree is not NULL).
Perhaps you started with two separate implementations (possibly with
a new one called UNIX that was added by largely copying and pasting
from the macOS one) and unified them, but that history does not
exist in this 10-patch series, so the above story would need to be
rewritten to say what actually is happening in this series, e.g., we
realized macOS one is applicable generally for UNIX variants so we
are renaming it from darwin to unix, or something.
> @@ -2365,7 +2365,11 @@ ifdef FSMONITOR_DAEMON_BACKEND
> COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
> COMPAT_OBJS += compat/fsmonitor/fsm-listen-$(FSMONITOR_DAEMON_BACKEND).o
> COMPAT_OBJS += compat/fsmonitor/fsm-health-$(FSMONITOR_DAEMON_BACKEND).o
> - COMPAT_OBJS += compat/fsmonitor/fsm-ipc-$(FSMONITOR_DAEMON_BACKEND).o
> +ifeq ($(FSMONITOR_DAEMON_BACKEND),win32)
> + COMPAT_OBJS += compat/fsmonitor/fsm-ipc-win32.o
> +else
> + COMPAT_OBJS += compat/fsmonitor/fsm-ipc-unix.o
> +endif
Makes me wonder if doing
#define FSMONITOR_DAEMON_BACKEND unix
for macOS and then keeping
COMPAT_OBJS += compat/fsmonitor/fsm-ipc-$(FSMONITOR_DAEMON_BACKEND).o
as-is would be cleaner.
Later, we can add the same "#define FSMONITOR_DAEMON_BACKEND unix"
for Linux, perhaps?
This is doubly true when we look at the build recipe changes to the
meson one below ...
> diff --git a/meson.build b/meson.build
> index dd52efd1c8..8de795f9d4 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1332,11 +1332,16 @@ if fsmonitor_backend != ''
>
> libgit_sources += [
> 'compat/fsmonitor/fsm-health-' + fsmonitor_backend + '.c',
> - 'compat/fsmonitor/fsm-ipc-' + fsmonitor_backend + '.c',
> 'compat/fsmonitor/fsm-listen-' + fsmonitor_backend + '.c',
> 'compat/fsmonitor/fsm-path-utils-' + fsmonitor_backend + '.c',
> 'compat/fsmonitor/fsm-settings-' + fsmonitor_backend + '.c',
> ]
> +
> + if fsmonitor_backend == 'win32'
> + libgit_sources += 'compat/fsmonitor/fsm-ipc-win32.c'
> + else
> + libgit_sources += 'compat/fsmonitor/fsm-ipc-unix.c'
> + endif
> endif
... which would not be needed if fsmonitor_backend is set to 'unix'
instead of 'darwin'.There was a problem hiding this comment.
Good point. Restructured: FSMONITOR_OS_SETTINGS is now set to "unix" for both macOS and Linux (and "win32" for Windows). The build files use
There was a problem hiding this comment.
Paul Tarjan wrote on the Git mailing list (how to reply to this email):
Junio C Hamano <gitster@pobox.com> writes:
> But the patch seems to indicate somewhat different story. The code
> before this patch started with a single macOS (darwin) one, but
> because it is mostly applicable to other UNIX variants as well, the
> patch renames the existing macOS one for unix and makes a small
> adjustment (namely, asserts that r->worktree is not NULL).
Good point. Restructured: FSMONITOR_OS_SETTINGS is now set to "unix"
for both macOS and Linux (and "win32" for Windows). The build files
use $(FSMONITOR_OS_SETTINGS) for fsm-ipc and fsm-settings, and
$(FSMONITOR_DAEMON_BACKEND) for the platform-specific files (listen,
health, path-utils). No more if/else blocks.
Also rewrote both commit messages to describe what's actually
happening (renaming darwin to unix) rather than "merging two
implementations".
Thanks,
Paul| COMPAT_OBJS += compat/fsmonitor/fsm-ipc-unix.o | ||
| endif | ||
| endif | ||
|
|
There was a problem hiding this comment.
Junio C Hamano wrote on the Git mailing list (how to reply to this email):
"Paul Tarjan via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Paul Tarjan <github@paulisageek.com>
>
> The fsmonitor settings logic for checking Unix domain socket
> compatibility is nearly identical between macOS and Linux. Both check
> whether the IPC socket directory resides on a remote filesystem or a
> filesystem that does not support Unix domain sockets (NTFS, FAT32, etc).
>
> Merge the two implementations into a single fsm-settings-unix.c shared
> by both platforms. The unified version uses the safer xstrdup() +
> dirname() approach from the macOS implementation (avoiding strbuf
> mutation with dirname()) and includes the "vfat" filesystem check.
>
> Update Makefile, meson.build, and CMakeLists.txt to use the new shared
> file for non-Windows platforms.
I guess exactly the same comment as the one for 05/10 applies here
as well?
There was a problem hiding this comment.
Yep, same fix applied here. Also rewrote both commit messages to describe what's actually happening (renaming darwin to unix) rather than "merging two implementations".
There was a problem hiding this comment.
Paul Tarjan wrote on the Git mailing list (how to reply to this email):
Junio C Hamano <gitster@pobox.com> writes:
> I guess exactly the same comment as the one for 05/10 applies here
> as well?
Yep, same fix applied here. Also rewrote both commit messages to
describe what's actually happening (renaming darwin to unix) rather
than "merging two implementations".
Thanks,
Paul| @@ -832,6 +832,17 @@ int start_command(struct child_process *cmd) | ||
| child_close(cmd->out); |
There was a problem hiding this comment.
Junio C Hamano wrote on the Git mailing list (how to reply to this email):
"Paul Tarjan via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Paul Tarjan <github@paulisageek.com>
>
> Add a new option to struct child_process that closes file descriptors
> 3 and above in the child after forking but before exec. This prevents
> long-running child processes from inheriting pipe endpoints or other
> descriptors from the parent environment.
>
> The upper bound for the fd scan comes from sysconf(_SC_OPEN_MAX),
> capped at 4096 to avoid excessive iteration when the limit is set
> very high.
>
> Signed-off-by: Paul Tarjan <github@paulisageek.com>
> ---
> run-command.c | 11 +++++++++++
> run-command.h | 9 +++++++++
> 2 files changed, 20 insertions(+)
All makes sense.
I somehow find it a bit surprising that we did not already have this
feature anywhere, as closing all except for the low file descriptors
connected to stdio by default is fairly a common thing to do.
> diff --git a/run-command.c b/run-command.c
> index e3e02475cc..cbadcf5ff8 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -832,6 +832,17 @@ fail_pipe:
> child_close(cmd->out);
> }
>
> + if (cmd->close_fd_above_stderr) {
> + long max_fd = sysconf(_SC_OPEN_MAX);
> + int fd;
> + if (max_fd < 0 || max_fd > 4096)
> + max_fd = 4096;
> + for (fd = 3; fd < max_fd; fd++) {
> + if (fd != child_notifier)
> + close(fd);
> + }
> + }
> +
> if (cmd->dir && chdir(cmd->dir))
> child_die(CHILD_ERR_CHDIR);
>
> diff --git a/run-command.h b/run-command.h
> index 0df25e445f..a1aa1b1069 100644
> --- a/run-command.h
> +++ b/run-command.h
> @@ -141,6 +141,15 @@ struct child_process {
> unsigned stdout_to_stderr:1;
> unsigned clean_on_exit:1;
> unsigned wait_after_clean:1;
> +
> + /**
> + * Close file descriptors 3 and above in the child after forking
> + * but before exec. This prevents the long-running child from
> + * inheriting pipe endpoints or other descriptors from the parent
> + * environment (e.g., the test harness).
> + */
> + unsigned close_fd_above_stderr:1;
> +
> void (*clean_on_exit_handler)(struct child_process *process);
> };The cookie wait in with_lock__wait_for_cookie() uses an infinite pthread_cond_wait() loop. The existing comment notes the desire to switch to pthread_cond_timedwait(), but the routine was not available in git thread-utils. On certain container or overlay filesystems, inotify watches may succeed but events are never delivered. In this case the daemon would hang indefinitely waiting for the cookie event, which in turn causes the client to hang. Replace the infinite wait with a one-second timeout using pthread_cond_timedwait(). If the timeout fires, report an error and let the client proceed with a trivial (full-scan) response rather than blocking forever. Signed-off-by: Paul Tarjan <github@paulisageek.com>
The macOS fsm-ipc-darwin.c is applicable to other Unix variants as well. Rename it to fsm-ipc-unix.c and add a worktree NULL check (BUG guard) that was missing from the macOS version. To support this, introduce FSMONITOR_OS_SETTINGS which is set to "unix" for both macOS and Linux, distinct from FSMONITOR_DAEMON_BACKEND which remains platform-specific (darwin, linux, win32). Move fsm-path-utils from FSMONITOR_OS_SETTINGS to FSMONITOR_DAEMON_BACKEND since the path-utils files are platform-specific. Signed-off-by: Paul Tarjan <github@paulisageek.com>
The macOS fsm-settings-darwin.c is applicable to other Unix variants as well. Rename it to fsm-settings-unix.c, using the safer xstrdup()+dirname() approach and including the "vfat" filesystem check. Now that both fsm-ipc and fsm-settings use the "unix" variant name, set FSMONITOR_OS_SETTINGS to "unix" for macOS in config.mak.uname and remove the if/else conditionals from the build files. Signed-off-by: Paul Tarjan <github@paulisageek.com>
Implement the built-in fsmonitor daemon for Linux using the inotify API, bringing it to feature parity with the existing Windows and macOS implementations. The implementation uses inotify rather than fanotify because fanotify requires either CAP_SYS_ADMIN or CAP_PERFMON capabilities, making it unsuitable for an unprivileged user-space daemon. While inotify has the limitation of requiring a separate watch on every directory (unlike macOS's FSEvents, which can monitor an entire directory tree with a single watch), it operates without elevated privileges and provides the per-file event granularity needed for fsmonitor. The listener uses inotify_init1(O_NONBLOCK) with a poll loop that checks for events with a 50-millisecond timeout, keeping the inotify queue well-drained to minimize the risk of overflows. Bidirectional hashmaps map between watch descriptors and directory paths for efficient event resolution. Directory renames are tracked using inotify's cookie mechanism to correlate IN_MOVED_FROM and IN_MOVED_TO event pairs; a periodic check detects stale renames where the matching IN_MOVED_TO never arrived, forcing a resync. New directory creation triggers recursive watch registration to ensure all subdirectories are monitored. The IN_MASK_CREATE flag is used where available to prevent modifying existing watches, with a fallback for older kernels. When IN_MASK_CREATE is available and inotify_add_watch returns EEXIST, it means another thread or recursive scan has already registered the watch, so it is safe to ignore. Remote filesystem detection uses statfs() to identify network-mounted filesystems (NFS, CIFS, SMB, FUSE, etc.) via their magic numbers. Mount point information is read from /proc/mounts and matched against the statfs f_fsid to get accurate, human-readable filesystem type names for logging. When the .git directory is on a remote filesystem, the IPC socket falls back to $HOME or a user-configured directory via the fsmonitor.socketDir setting. Based-on-patch-by: Eric DeCosta <edecosta@mathworks.com> Based-on-patch-by: Marziyeh Esipreh <marziyeh.esipreh@gmail.com> Signed-off-by: Paul Tarjan <github@paulisageek.com>
Add a smoke test that verifies the filesystem actually delivers inotify events to the daemon. On some configurations (e.g., overlayfs with older kernels), inotify watches succeed but events are never delivered. The daemon cookie wait will time out, but every subsequent test would fail. Skip the entire test file early when this is detected. Add a test that exercises rapid nested directory creation to verify the daemon correctly handles the EEXIST race between recursive scan and queued inotify events. When IN_MASK_CREATE is available and a directory watch is added during recursive registration, the kernel may also deliver a queued IN_CREATE event for the same directory. The second inotify_add_watch() returns EEXIST, which must be treated as harmless. An earlier version of the listener crashed in this scenario. Signed-off-by: Paul Tarjan <github@paulisageek.com>
Add a new option to struct child_process that closes file descriptors 3 and above in the child after forking but before exec. This prevents long-running child processes from inheriting pipe endpoints or other descriptors from the parent environment. The upper bound for the fd scan comes from sysconf(_SC_OPEN_MAX), capped at 4096 to avoid excessive iteration when the limit is set very high. Signed-off-by: Paul Tarjan <github@paulisageek.com>
When the fsmonitor daemon is spawned as a background process, it may inherit file descriptors from its parent that it does not need. In particular, when the test harness or a CI system captures output through pipes, the daemon can inherit duplicated pipe endpoints. If the daemon holds these open, the parent process never sees EOF and may appear to hang. Set close_fd_above_stderr on the child process at daemon startup so that file descriptors 3 and above are closed before any daemon work begins. This ensures the daemon does not inadvertently hold open descriptors from its launching environment. Additionally, call setsid() when the daemon starts with --detach to create a new session and process group. Without this, shells that enable job control (e.g. bash with "set -m") treat the daemon as part of the spawning command's job. Their "wait" builtin then blocks until the daemon exits, which it never does. This specifically affects systems where /bin/sh is bash (e.g. Fedora), since dash only waits for the specific PID rather than the full process group. Add a 30-second timeout to "fsmonitor--daemon stop" so it does not block indefinitely if the daemon fails to shut down. Signed-off-by: Paul Tarjan <github@paulisageek.com>
bb438af to
4987a00
Compare
|
/submit |
|
Submitted as pull.2147.v7.git.git.1772065643.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
|
There was a status update in the "Cooking" section about the branch The fsmonitor daemon has been implemented for Linux. Comments? source: <pull.2147.v6.git.git.1772050636.gitgitgadget@gmail.com> |
|
Junio C Hamano wrote on the Git mailing list (how to reply to this email): "Paul Tarjan via GitGitGadget" <gitgitgadget@gmail.com> writes:
> Changes since v6:
>
> * Introduced FSMONITOR_OS_SETTINGS build variable (set to "unix" for macOS
> and Linux, "win32" for Windows) to eliminate if/else conditionals in
> Makefile, meson.build, and CMakeLists.txt per Junio's review
> * Moved fsm-path-utils from FSMONITOR_OS_SETTINGS to
> FSMONITOR_DAEMON_BACKEND since path-utils files are platform-specific
> * Removed V9FS_MAGIC from remote filesystem detection (9p is used for local
> VM/container host mounts where fsmonitor works fine)
> * Removed redundant #include <libgen.h> (already provided by
> compat/posix.h)
> * Fixed cookie wait comment wording ("see" → "observe")
> * Rewrote commit messages for IPC and settings dedup patches
I saw nothing unexpected in this iteration relative to the previous
one. Looking good.
It would be nice to have another set of eyes on the inotify part of
this series before we mark it for 'next'.
Thanks. |
|
There was a status update in the "Cooking" section about the branch The fsmonitor daemon has been implemented for Linux. Will merge to 'next'? cf. <xmqq5x7j5xje.fsf@gitster.g> source: <pull.2147.v7.git.git.1772065643.gitgitgadget@gmail.com> |
|
This patch series was integrated into seen via 50e1203. |
|
User |
This series implements the built-in fsmonitor daemon for Linux using the inotify API, bringing it to feature parity with the existing Windows and macOS implementations. It also fixes two memory leaks in the platform-independent daemon code and deduplicates the IPC and settings logic that is now shared between macOS and Linux.
The implementation uses inotify rather than fanotify because fanotify requires either CAP_SYS_ADMIN or CAP_PERFMON capabilities, making it unsuitable for an unprivileged user-space daemon. While inotify has the limitation of requiring a separate watch on every directory (unlike macOS FSEvents, which can monitor an entire directory tree with a single watch), it operates without elevated privileges and provides the per-file event granularity needed for fsmonitor.
The listener uses inotify_init1(O_NONBLOCK) with a poll loop that checks for events with a 50-millisecond timeout, keeping the inotify queue well-drained to minimize the risk of overflows. Bidirectional hashmaps map between watch descriptors and directory paths for efficient event resolution. Directory renames are tracked using inotify cookie mechanism to correlate IN_MOVED_FROM and IN_MOVED_TO event pairs; a periodic check detects stale renames where the matching IN_MOVED_TO never arrived, forcing a resync.
New directory creation triggers recursive watch registration to ensure all subdirectories are monitored. The IN_MASK_CREATE flag is used where available to prevent modifying existing watches, with a fallback for older kernels. When IN_MASK_CREATE is available and inotify_add_watch returns EEXIST, it means another thread or recursive scan has already registered the watch, so it is safe to ignore.
Remote filesystem detection uses statfs() to identify network-mounted filesystems (NFS, CIFS, SMB, FUSE, etc.) via their magic numbers. Mount point information is read from /proc/mounts and matched against the statfs f_fsid to get accurate, human-readable filesystem type names for logging. When the .git directory is on a remote filesystem, the IPC socket falls back to $HOME or a user-configured directory via the fsmonitor.socketDir setting.
This series builds on work from #1352 by Eric DeCosta and #1667 by Marziyeh Esipreh, updated to work with the current codebase and address all review feedback.
Changes since v6:
Changes since v5:
Changes since v4:
Changes since v3:
Changes since v2:
Changes since v1:
CC: Patrick Steinhardt ps@pks.im
cc: Paul Tarjan paul@paultarjan.com