for-each-repo: work correctly in a worktree#2056
for-each-repo: work correctly in a worktree#2056derrickstolee wants to merge 2 commits intogitgitgadget:mastergitgitgadget/git:masterfrom derrickstolee:for-each-repo-in-gitdirderrickstolee/git:for-each-repo-in-gitdirCopy head branch name to clipboard
Conversation
|
/submit |
|
Submitted as pull.2056.git.1771903950.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
| @@ -1,7 +1,6 @@ | ||
| #define USE_THE_REPOSITORY_VARIABLE | ||
|
|
||
| #include "builtin.h" |
There was a problem hiding this comment.
Eric Sunshine wrote on the Git mailing list (how to reply to this email):
[Cc:+peff]
On Mon, Feb 23, 2026 at 10:32 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> When run in a worktree, the GIT_DIR directory is set in a different way
> than in a typical repository. Show this by updating t0068 to include a
> worktree and add a test that runs from that worktree. This requires
> moving the repo.key config into a global config instead of the base test
> repository's local config (demonstrating that it worked with
> non-worktree Git repositories).
>
> The fix is simple: unset the environment variable before looping over
> the repos.
>
> Signed-off-by: Derrick Stolee <stolee@gmail.com>
> ---
> diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
> @@ -60,6 +61,9 @@ int cmd_for_each_repo(int argc,
> + /* Be sure to not pass GIT_DIR to children. */
> + unsetenv(GIT_DIR_ENVIRONMENT);
This only unsets GIT_DIR. Is that sufficient in the general case?
Elsewhere, we recommend[*] unsetting all of Git's local environment
variables.
[*]: From the "githooks" man page: "Environment variables, such as
GIT_DIR, GIT_WORK_TREE, etc., are exported so that Git commands run by
the hook can correctly locate the repository. If your hook needs to
invoke Git commands in a foreign repository or in a different working
tree of the same repository, then it should clear these environment
variables so they do not interfere with Git operations at the foreign
location. For example: `unset $(git rev-parse --local-env-vars)`"There was a problem hiding this comment.
Jeff King wrote on the Git mailing list (how to reply to this email):
On Mon, Feb 23, 2026 at 10:34:30PM -0500, Eric Sunshine wrote:
> > diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
> > @@ -60,6 +61,9 @@ int cmd_for_each_repo(int argc,
> > + /* Be sure to not pass GIT_DIR to children. */
> > + unsetenv(GIT_DIR_ENVIRONMENT);
>
> This only unsets GIT_DIR. Is that sufficient in the general case?
> Elsewhere, we recommend[*] unsetting all of Git's local environment
> variables.
>
> [*]: From the "githooks" man page: "Environment variables, such as
> GIT_DIR, GIT_WORK_TREE, etc., are exported so that Git commands run by
> the hook can correctly locate the repository. If your hook needs to
> invoke Git commands in a foreign repository or in a different working
> tree of the same repository, then it should clear these environment
> variables so they do not interfere with Git operations at the foreign
> location. For example: `unset $(git rev-parse --local-env-vars)`"
Yeah, agreed. There's another subtle issue, which is that this is
unsetting GIT_DIR in the parent process. So any other code we call that
is meant to run in the original repo might get confused. I can well
believe there isn't any such code for a command like for-each-repo, but
as a general principle, the change should be made in the sub-process.
You can stick the elements of local_repo_env into the "env" list of the
child_process struct. If you grep around, you can find some instances of
this.
There's an open question there of how to handle config in the
environment, though. Depending on the sub-process, you may or may not
want such config to pass down to it. For for-each-repo, I'd guess that
you'd want:
git -c foo.bar=baz for-each-repo ...
to pass that foo.bar value. We do have a helper to handle that in
run-command.h:
/**
* Convenience function which prepares env for a command to be run in a
* new repo. This adds all GIT_* environment variables to env with the
* exception of GIT_CONFIG_PARAMETERS and GIT_CONFIG_COUNT (which cause the
* corresponding environment variables to be unset in the subprocess) and adds
* an environment variable pointing to new_git_dir. See local_repo_env in
* environment.h for more information.
*/
void prepare_other_repo_env(struct strvec *env, const char *new_git_dir);
Do be careful using it here, though. It expects to set GIT_DIR itself to
point to the new repo (which is passed in). But I'm not sure that's 100%
compatible with how for-each-repo works, which is using "git -C $repo"
under the hood, and letting the usual discovery happen.
So for a bare repo, you'd want to pass the repo directory. But for a
non-bare one, you'd want $repo/.git. And there are even more weird
corner cases, like the fact that using "/my/repo/but/inside/a/subdir"
with for-each-repo will find "/my/repo".
So you might need to refactor prepare_other_repo_env() to split out the
"everything but the config" logic versus the "set GIT_DIR" logic. Or
just inline the former in run_command_on_repo(), though it probably is
better to keep the logic in one place (it's not many lines, but it has
to know about all of the env variables that affect config).
Alternatively, for-each-repo could do repo discovery itself on the paths
it is passed, before calling sub-programs. That's a bigger change, but
possibly it could or should be flagging an error for some cases? I
dunno.
-PeffThere was a problem hiding this comment.
Patrick Steinhardt wrote on the Git mailing list (how to reply to this email):
On Mon, Feb 23, 2026 at 10:34:30PM -0500, Eric Sunshine wrote:
> [Cc:+peff]
>
> On Mon, Feb 23, 2026 at 10:32 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> > When run in a worktree, the GIT_DIR directory is set in a different way
> > than in a typical repository. Show this by updating t0068 to include a
> > worktree and add a test that runs from that worktree. This requires
> > moving the repo.key config into a global config instead of the base test
> > repository's local config (demonstrating that it worked with
> > non-worktree Git repositories).
> >
> > The fix is simple: unset the environment variable before looping over
> > the repos.
> >
> > Signed-off-by: Derrick Stolee <stolee@gmail.com>
> > ---
> > diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
> > @@ -60,6 +61,9 @@ int cmd_for_each_repo(int argc,
> > + /* Be sure to not pass GIT_DIR to children. */
> > + unsetenv(GIT_DIR_ENVIRONMENT);
>
> This only unsets GIT_DIR. Is that sufficient in the general case?
> Elsewhere, we recommend[*] unsetting all of Git's local environment
> variables.
Good question indeed. We have the `local_repo_env` array that contains
all the environment variables that may influence repository discovery.
PatrickThere was a problem hiding this comment.
Derrick Stolee wrote on the Git mailing list (how to reply to this email):
On 2/24/26 4:18 AM, Jeff King wrote:
> On Mon, Feb 23, 2026 at 10:34:30PM -0500, Eric Sunshine wrote:
> >>> diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
>>> @@ -60,6 +61,9 @@ int cmd_for_each_repo(int argc,
>>> + /* Be sure to not pass GIT_DIR to children. */
>>> + unsetenv(GIT_DIR_ENVIRONMENT);
>>
>> This only unsets GIT_DIR. Is that sufficient in the general case?
>> Elsewhere, we recommend[*] unsetting all of Git's local environment
>> variables.
>>
>> [*]: From the "githooks" man page: "Environment variables, such as
>> GIT_DIR, GIT_WORK_TREE, etc., are exported so that Git commands run by
>> the hook can correctly locate the repository. If your hook needs to
>> invoke Git commands in a foreign repository or in a different working
>> tree of the same repository, then it should clear these environment
>> variables so they do not interfere with Git operations at the foreign
>> location. For example: `unset $(git rev-parse --local-env-vars)`"
> > Yeah, agreed. There's another subtle issue, which is that this is
> unsetting GIT_DIR in the parent process. So any other code we call that
> is meant to run in the original repo might get confused. I can well
> believe there isn't any such code for a command like for-each-repo, but
> as a general principle, the change should be made in the sub-process.
> > You can stick the elements of local_repo_env into the "env" list of the
> child_process struct. If you grep around, you can find some instances of
> this.
> > There's an open question there of how to handle config in the
> environment, though. Depending on the sub-process, you may or may not
> want such config to pass down to it. For for-each-repo, I'd guess that
> you'd want:
> > git -c foo.bar=baz for-each-repo ...
> > to pass that foo.bar value. We do have a helper to handle that in
> run-command.h:
> > /**
> * Convenience function which prepares env for a command to be run in a
> * new repo. This adds all GIT_* environment variables to env with the
> * exception of GIT_CONFIG_PARAMETERS and GIT_CONFIG_COUNT (which cause the
> * corresponding environment variables to be unset in the subprocess) and adds
> * an environment variable pointing to new_git_dir. See local_repo_env in
> * environment.h for more information.
> */
> void prepare_other_repo_env(struct strvec *env, const char *new_git_dir);
> > Do be careful using it here, though. It expects to set GIT_DIR itself to
> point to the new repo (which is passed in). But I'm not sure that's 100%
> compatible with how for-each-repo works, which is using "git -C $repo"
> under the hood, and letting the usual discovery happen.
> > So for a bare repo, you'd want to pass the repo directory. But for a
> non-bare one, you'd want $repo/.git. And there are even more weird
> corner cases, like the fact that using "/my/repo/but/inside/a/subdir"
> with for-each-repo will find "/my/repo".
> > So you might need to refactor prepare_other_repo_env() to split out the
> "everything but the config" logic versus the "set GIT_DIR" logic. Or
> just inline the former in run_command_on_repo(), though it probably is
> better to keep the logic in one place (it's not many lines, but it has
> to know about all of the env variables that affect config).
Thanks for the recommendations. I'll come back with a more sophisticated
v2 that handles these issues.
> Alternatively, for-each-repo could do repo discovery itself on the paths
> it is passed, before calling sub-programs. That's a bigger change, but
> possibly it could or should be flagging an error for some cases? I
> dunno.
I'm surprised that passing '-C <repo>' doesn't already overwrite these
variables but I suppose environment variables override arguments in this
case. (This is the root of the bug.)
Thanks,
-StoleeThere was a problem hiding this comment.
Jeff King wrote on the Git mailing list (how to reply to this email):
On Tue, Feb 24, 2026 at 07:11:13AM -0500, Derrick Stolee wrote:
> > it is passed, before calling sub-programs. That's a bigger change, but
> > possibly it could or should be flagging an error for some cases? I
> > dunno.
> I'm surprised that passing '-C <repo>' doesn't already overwrite these
> variables but I suppose environment variables override arguments in this
> case. (This is the root of the bug.)
I can see why you'd be surprised if you think of "-C" as "change to this
git repo". But it really is "change to this directory". It is perfectly
OK to "git -C" into a non-toplevel directory of a repo (and continue
respecting any repo discovery that happened already and is in the
environment), or even weird stuff like:
GIT_DIR=/some/repo.git git -C /some/worktree add foo
What you almost kind-of want is "--git-dir", except it puts the onus on
the caller to find the actual repo directory (so detecting bare vs
discovering the .git). Part of the point of introducing -C long ago was
because --git-dir was so annoying to use.
Probably there is room for some middle-ground option, which is "do repo
detection starting in this directory and use that as the --git-dir" (and
I guess also do worktree discovery in the same way). But I don't think
we would ever switch -C to that. It would almost certainly break lots of
people if we changed it now.
-Peff|
User |
|
User |
| @@ -1,5 +1,3 @@ | ||
| #define USE_THE_REPOSITORY_VARIABLE | ||
|
|
||
| #include "builtin.h" |
There was a problem hiding this comment.
Patrick Steinhardt wrote on the Git mailing list (how to reply to this email):
On Tue, Feb 24, 2026 at 03:32:29AM +0000, Derrick Stolee via GitGitGadget wrote:
> diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
> index 325a7925f1..478ccf1287 100644
> --- a/builtin/for-each-repo.c
> +++ b/builtin/for-each-repo.c
> @@ -1,5 +1,3 @@
> -#define USE_THE_REPOSITORY_VARIABLE
> -
> #include "builtin.h"
> #include "config.h"
> #include "gettext.h"
> @@ -33,7 +31,7 @@ static int run_command_on_repo(const char *path, int argc, const char ** argv)
> int cmd_for_each_repo(int argc,
> const char **argv,
> const char *prefix,
> - struct repository *repo UNUSED)
> + struct repository *repo)
> {
> static const char *config_key = NULL;
> int keep_going = 0;
> @@ -55,7 +53,7 @@ int cmd_for_each_repo(int argc,
> if (!config_key)
> die(_("missing --config=<config>"));
>
> - err = repo_config_get_string_multi(the_repository, config_key, &values);
> + err = repo_config_get_string_multi(repo, config_key, &values);
> if (err < 0)
> usage_msg_optf(_("got bad config --config=%s"),
> for_each_repo_usage, options, config_key);
The command is marked as `RUN_SETUP_GENTLY`, so it may run in a context
where there is no repository. In such cases, `repo` would be `NULL`, and
that would cause the command to segfault here, wouldn't it?
PatrickThere was a problem hiding this comment.
Derrick Stolee wrote on the Git mailing list (how to reply to this email):
On 2/24/26 4:18 AM, Patrick Steinhardt wrote:
> On Tue, Feb 24, 2026 at 03:32:29AM +0000, Derrick Stolee via GitGitGadget wrote:
>> diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
>> index 325a7925f1..478ccf1287 100644
>> --- a/builtin/for-each-repo.c
>> +++ b/builtin/for-each-repo.c
>> @@ -1,5 +1,3 @@
>> -#define USE_THE_REPOSITORY_VARIABLE
>> -
>> #include "builtin.h"
>> #include "config.h"
>> #include "gettext.h"
>> @@ -33,7 +31,7 @@ static int run_command_on_repo(const char *path, int argc, const char ** argv)
>> int cmd_for_each_repo(int argc,
>> const char **argv,
>> const char *prefix,
>> - struct repository *repo UNUSED)
>> + struct repository *repo)
>> {
>> static const char *config_key = NULL;
>> int keep_going = 0;
>> @@ -55,7 +53,7 @@ int cmd_for_each_repo(int argc,
>> if (!config_key)
>> die(_("missing --config=<config>"));
>> >> - err = repo_config_get_string_multi(the_repository, config_key, &values);
>> + err = repo_config_get_string_multi(repo, config_key, &values);
>> if (err < 0)
>> usage_msg_optf(_("got bad config --config=%s"),
>> for_each_repo_usage, options, config_key);
> > The command is marked as `RUN_SETUP_GENTLY`, so it may run in a context
> where there is no repository. In such cases, `repo` would be `NULL`, and
> that would cause the command to segfault here, wouldn't it?
Ah. That's an interesting subtlety of the setup that I did not know.
I'll make sure this is covered in tests, because the current tests run in
the default test repo but our expected use case should be outside a repo.
Thanks,
-Stolee|
User |
The 'git for-each-repo' tool is frequently run outside of a repo context in the real world. For example, it powers background maintenance. Despite this typical case, we have not been testing it without a local repository. Update t0068 to stop creating a test repo and to use global config everywhere. This has some subtle changes to test across the file. This was noticed because an earlier attempt to remove the_repository from builtin/for-each-repo.c did not catch a segmentation fault since the passed 'repo' is NULL. This use of the_repository will need to stay until we have a better way to handle config queries outside of a repo context. Similar use still exists in builtin/config.c for the same reason. Signed-off-by: Derrick Stolee <stolee@gmail.com>
a47f9e9 to
4e3f4aa
Compare
|
/submit |
|
Submitted as pull.2056.v2.git.1771968924.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
| @@ -2,6 +2,7 @@ | ||
|
|
There was a problem hiding this comment.
Junio C Hamano wrote on the Git mailing list (how to reply to this email):
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> static int run_command_on_repo(const char *path, int argc, const char ** argv)
> {
> - int i;
> + int res;
> struct child_process child = CHILD_PROCESS_INIT;
> + char **envvars;
> + size_t envvar_nr = 0;
> char *abspath = interpolate_path(path, 0);
>
> + while (local_repo_env[envvar_nr])
> + envvar_nr++;
> +
> + CALLOC_ARRAY(envvars, envvar_nr);
> +
> + for (size_t i = 0; i < envvar_nr; i++) {
> + envvars[i] = getenv(local_repo_env[i]);
> +
> + if (envvars[i]) {
> + unsetenv(local_repo_env[i]);
> + envvars[i] = xstrdup(envvars[i]);
> + }
> + }
>
> child.git_cmd = 1;
> strvec_pushl(&child.args, "-C", abspath, NULL);
>
> - for (i = 0; i < argc; i++)
> + for (int i = 0; i < argc; i++)
> strvec_push(&child.args, argv[i]);
>
> free(abspath);
>
> - return run_command(&child);
> + res = run_command(&child);
> +
> + for (size_t i = 0; i < envvar_nr; i++) {
> + if (envvars[i]) {
> + setenv(local_repo_env[i], envvars[i], 1);
> + free(envvars[i]);
> + }
> + }
> +
> + free(envvars);
> + return res;
> }
Doesn't run_command() let you unsetenv in the child without
affecting the parent process?
Looking at run-command.c:prep_childenv(), it seems that you can pass
"VAR=VAL" to "export VAR=VAL" in the child, and pass "VAR" to "unset
VAR" in the child.
Or is it essential to unset in both parent and child while the child
is working and that is why we unset in the parent and then restore
later? I find this highly confusing.
> int cmd_for_each_repo(int argc,
> diff --git a/t/t0068-for-each-repo.sh b/t/t0068-for-each-repo.sh
> index 512af34c82..d55557a934 100755
> --- a/t/t0068-for-each-repo.sh
> +++ b/t/t0068-for-each-repo.sh
> @@ -10,7 +10,7 @@ TEST_NO_CREATE_REPO=1
> test_expect_success 'run based on configured value' '
> git init one &&
> git init two &&
> - git init three &&
> + git -C two worktree add --orphan ../three &&
> git init ~/four &&
> git -C two commit --allow-empty -m "DID NOT RUN" &&
> git config --global run.key "$TRASH_DIRECTORY/one" &&
> @@ -35,7 +35,17 @@ test_expect_success 'run based on configured value' '
> git -C three log -1 --pretty=format:%s >message &&
> grep again message &&
> git -C ~/four log -1 --pretty=format:%s >message &&
> - grep again message
> + grep again message &&
> +
> + git -C three for-each-repo --config=run.key -- commit --allow-empty -m "ran from worktree" &&
> + git -C one log -1 --pretty=format:%s >message &&
> + grep worktree message &&
> + git -C two log -1 --pretty=format:%s >message &&
> + ! grep worktree message &&
> + git -C three log -1 --pretty=format:%s >message &&
> + grep worktree message &&
> + git -C ~/four log -1 --pretty=format:%s >message &&
> + grep worktree message
> '
>
> test_expect_success 'do nothing on empty config' 'There was a problem hiding this comment.
Derrick Stolee wrote on the Git mailing list (how to reply to this email):
On 2/24/26 4:47 PM, Junio C Hamano wrote:
> Doesn't run_command() let you unsetenv in the child without
> affecting the parent process?
> > Looking at run-command.c:prep_childenv(), it seems that you can pass
> "VAR=VAL" to "export VAR=VAL" in the child, and pass "VAR" to "unset
> VAR" in the child.
You're right. Here's a much simpler implementation:
static int run_command_on_repo(const char *path, int argc, const char ** argv)
{
int i = 0;
struct child_process child = CHILD_PROCESS_INIT;
char *abspath = interpolate_path(path, 0);
while (local_repo_env[i]) {
strvec_push(&child.env, local_repo_env[i]);
i++;
}
child.git_cmd = 1;
strvec_pushl(&child.args, "-C", abspath, NULL);
for (i = 0; i < argc; i++)
strvec_push(&child.args, argv[i]);
free(abspath);
return run_command(&child);
}
> Or is it essential to unset in both parent and child while the child
> is working and that is why we unset in the parent and then restore
> later? I find this highly confusing.
Nope, not necessary to adjust it in the parent. The simpler version
above works in my test case. I'll apply it to an upcoming v3, but
will wait a couple of days to see if there is any more feedback on
this v2.5 before doing so.
Thanks,
-StoleeThere was a problem hiding this comment.
Jeff King wrote on the Git mailing list (how to reply to this email):
On Wed, Feb 25, 2026 at 06:44:51AM -0500, Derrick Stolee wrote:
> > Looking at run-command.c:prep_childenv(), it seems that you can pass
> > "VAR=VAL" to "export VAR=VAL" in the child, and pass "VAR" to "unset
> > VAR" in the child.
>
> You're right. Here's a much simpler implementation:
>
> static int run_command_on_repo(const char *path, int argc, const char ** argv)
> {
> int i = 0;
> struct child_process child = CHILD_PROCESS_INIT;
> char *abspath = interpolate_path(path, 0);
>
> while (local_repo_env[i]) {
> strvec_push(&child.env, local_repo_env[i]);
> i++;
> }
You can actually just use strvec_pushv() to do this as a one-liner
(though annoyingly you need a cast because of how const works; you can
easily find an example with grep).
But I really think you should consider keeping config-related variables
in place, as prepare_other_repo_env() does. Otherwise something like:
git -c pack.threads=1 for-each-repo repack -ad
will ignore that config in the sub-processes (whereas it currently is
respected).
And for that, you do need to loop yourself.
-PeffThere was a problem hiding this comment.
Derrick Stolee wrote on the Git mailing list (how to reply to this email):
On 2/25/2026 8:13 AM, Jeff King wrote:
> On Wed, Feb 25, 2026 at 06:44:51AM -0500, Derrick Stolee wrote:
>
>>> Looking at run-command.c:prep_childenv(), it seems that you can pass
>>> "VAR=VAL" to "export VAR=VAL" in the child, and pass "VAR" to "unset
>>> VAR" in the child.
> But I really think you should consider keeping config-related variables
> in place, as prepare_other_repo_env() does. Otherwise something like:
>
> git -c pack.threads=1 for-each-repo repack -ad
>
> will ignore that config in the sub-processes (whereas it currently is
> respected).
>
> And for that, you do need to loop yourself.
Great point. Here's another attempt:
static int run_command_on_repo(const char *path, int argc, const char ** argv)
{
int i = 0;
struct child_process child = CHILD_PROCESS_INIT;
char *abspath = interpolate_path(path, 0);
while (local_repo_env[i]) {
/*
* Preserve pre-builtin options:
* - CONFIG_ENVIRONMENT, CONFIG_DATA_ENVIRONMENT, and
* CONFIG_COUNT_ENVIRONMENT persist -c <name>=<value>
* and --config-env=<name>=<envvar> options.
* - NO_REPLACE_OBJECTS_ENVIRONMENT persists the
* --no-replace-objects option.
*
* Note that the following options are not in local_repo_env:
* - EXEC_PATH_ENVIRONMENT persists --exec-path option.
*/
if (strncmp(local_repo_env[i], "CONFIG_", 7) &&
strcmp(local_repo_env[i], NO_REPLACE_OBJECTS_ENVIRONMENT))
strvec_push(&child.env, local_repo_env[i]);
i++;
}
child.git_cmd = 1;
strvec_pushl(&child.args, "-C", abspath, NULL);
for (i = 0; i < argc; i++)
strvec_push(&child.args, argv[i]);
free(abspath);
return run_command(&child);
}
This comment details my findings from comparing the list in
local_repo_env[] and the top-level options listed in
Documentation/git.adoc. That's how I was able to find that
--exec-path sets an environment variable that's NOT in the
list and we want to be sure we don't set it.
Should we add the comparison to EXEC_PATH_ENVIRONMENT as a
precaution to make sure it's not added to local_repo_env in
the future? Or is that too defensive?
Thanks,
-Stolee
There was a problem hiding this comment.
Junio C Hamano wrote on the Git mailing list (how to reply to this email):
Derrick Stolee <stolee@gmail.com> writes:
> On 2/25/2026 8:13 AM, Jeff King wrote:
>> On Wed, Feb 25, 2026 at 06:44:51AM -0500, Derrick Stolee wrote:
>>
>>>> Looking at run-command.c:prep_childenv(), it seems that you can pass
>>>> "VAR=VAL" to "export VAR=VAL" in the child, and pass "VAR" to "unset
>>>> VAR" in the child.
>
>> But I really think you should consider keeping config-related variables
>> in place, as prepare_other_repo_env() does. Otherwise something like:
>>
>> git -c pack.threads=1 for-each-repo repack -ad
>>
>> will ignore that config in the sub-processes (whereas it currently is
>> respected).
>>
>> And for that, you do need to loop yourself.
>
> Great point. Here's another attempt:
>
> static int run_command_on_repo(const char *path, int argc, const char ** argv)
> {
> int i = 0;
> struct child_process child = CHILD_PROCESS_INIT;
> char *abspath = interpolate_path(path, 0);
>
> while (local_repo_env[i]) {
> /*
> * Preserve pre-builtin options:
> * - CONFIG_ENVIRONMENT, CONFIG_DATA_ENVIRONMENT, and
> * CONFIG_COUNT_ENVIRONMENT persist -c <name>=<value>
> * and --config-env=<name>=<envvar> options.
> * - NO_REPLACE_OBJECTS_ENVIRONMENT persists the
> * --no-replace-objects option.
> *
> * Note that the following options are not in local_repo_env:
> * - EXEC_PATH_ENVIRONMENT persists --exec-path option.
> */
> if (strncmp(local_repo_env[i], "CONFIG_", 7) &&
Minor nit: !starts_with() lets you avoid counting bytes yourself and
hardcoding "7" here.
> strcmp(local_repo_env[i], NO_REPLACE_OBJECTS_ENVIRONMENT))
> strvec_push(&child.env, local_repo_env[i]);
>
> i++;
> }
>
> child.git_cmd = 1;
> strvec_pushl(&child.args, "-C", abspath, NULL);
>
> for (i = 0; i < argc; i++)
> strvec_push(&child.args, argv[i]);
If argv[argc] == NULL, then here is where we want strvec_pushv().
> free(abspath);
>
> return run_command(&child);
> }
>
> This comment details my findings from comparing the list in
> local_repo_env[] and the top-level options listed in
> Documentation/git.adoc. That's how I was able to find that
> --exec-path sets an environment variable that's NOT in the
> list and we want to be sure we don't set it.
Hmph, wouldn't we want to use specified exec-path inside ...
git --exec-path=~/my/git/libexec for-each-repo sh -c "do things"
... "do things" script when we find Git related binaries? Or am I
not getting what you are describing here?
> Should we add the comparison to EXEC_PATH_ENVIRONMENT as a
> precaution to make sure it's not added to local_repo_env in
> the future? Or is that too defensive?
>
> Thanks,
> -StoleeThere was a problem hiding this comment.
Phillip Wood wrote on the Git mailing list (how to reply to this email):
On 26/02/2026 16:21, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> >> static int run_command_on_repo(const char *path, int argc, const char ** argv)
>> {
>> int i = 0;
>> struct child_process child = CHILD_PROCESS_INIT;
>> char *abspath = interpolate_path(path, 0);
>>
>> while (local_repo_env[i]) {
>> /*
>> * Preserve pre-builtin options:
>> * - CONFIG_ENVIRONMENT, CONFIG_DATA_ENVIRONMENT, and
>> * CONFIG_COUNT_ENVIRONMENT persist -c <name>=<value>
>> * and --config-env=<name>=<envvar> options.
>> * - NO_REPLACE_OBJECTS_ENVIRONMENT persists the
>> * --no-replace-objects option.
>> *
>> * Note that the following options are not in local_repo_env:
>> * - EXEC_PATH_ENVIRONMENT persists --exec-path option.
>> */
>> if (strncmp(local_repo_env[i], "CONFIG_", 7) &&
> > Minor nit: !starts_with() lets you avoid counting bytes yourself and
> hardcoding "7" here.
More seriously it should be looking for strings starting with "GIT_CONFIG_", not the name of the preprocessor definitions.
Thanks
PhillipThere was a problem hiding this comment.
Junio C Hamano wrote on the Git mailing list (how to reply to this email):
Phillip Wood <phillip.wood123@gmail.com> writes:
>>> * Note that the following options are not in local_repo_env:
>>> * - EXEC_PATH_ENVIRONMENT persists --exec-path option.
>>> */
>>> if (strncmp(local_repo_env[i], "CONFIG_", 7) &&
>>
>> Minor nit: !starts_with() lets you avoid counting bytes yourself and
>> hardcoding "7" here.
>
> More seriously it should be looking for strings starting with
> "GIT_CONFIG_", not the name of the preprocessor definitions.
Thanks. I missed that completely.There was a problem hiding this comment.
Derrick Stolee wrote on the Git mailing list (how to reply to this email):
On 2/27/26 2:41 PM, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
> >>>> * Note that the following options are not in local_repo_env:
>>>> * - EXEC_PATH_ENVIRONMENT persists --exec-path option.
>>>> */
>>>> if (strncmp(local_repo_env[i], "CONFIG_", 7) &&
>>>
>>> Minor nit: !starts_with() lets you avoid counting bytes yourself and
>>> hardcoding "7" here.
>>
>> More seriously it should be looking for strings starting with
>> "GIT_CONFIG_", not the name of the preprocessor definitions.
> > Thanks. I missed that completely.
Same! And I will try to find a way to test these things to ensure
these mistakes are not prevented only by careful code reviewers!
Thanks,
-StoleeThere was a problem hiding this comment.
Jeff King wrote on the Git mailing list (how to reply to this email):
On Thu, Feb 26, 2026 at 10:29:47AM -0500, Derrick Stolee wrote:
> Great point. Here's another attempt:
>
> static int run_command_on_repo(const char *path, int argc, const char ** argv)
> {
> int i = 0;
> struct child_process child = CHILD_PROCESS_INIT;
> char *abspath = interpolate_path(path, 0);
>
> while (local_repo_env[i]) {
> /*
> * Preserve pre-builtin options:
> * - CONFIG_ENVIRONMENT, CONFIG_DATA_ENVIRONMENT, and
> * CONFIG_COUNT_ENVIRONMENT persist -c <name>=<value>
> * and --config-env=<name>=<envvar> options.
> * - NO_REPLACE_OBJECTS_ENVIRONMENT persists the
> * --no-replace-objects option.
> *
> * Note that the following options are not in local_repo_env:
> * - EXEC_PATH_ENVIRONMENT persists --exec-path option.
> */
> if (strncmp(local_repo_env[i], "CONFIG_", 7) &&
> strcmp(local_repo_env[i], NO_REPLACE_OBJECTS_ENVIRONMENT))
> strvec_push(&child.env, local_repo_env[i]);
This is slightly different than what prepare_other_repo_env() does:
- it doesn't drop GIT_CONFIG_*, but assumes that removing
GIT_CONFIG_COUNT is enough for GIT_CONFIG_KEY/VALUE to be ignored
(and then also removes GIT_CONFIG_PARAMETERS, of course)
- it doesn't consider NO_REPLACE_OBJECTS at all
I think you could make arguments either way about what should happen
when spawning a command in another repo. But I'd really prefer for us to
have a single spot to specify that policy, and not subtly-different
behavior from different commands. So I'd really like to see this using
that other function (or the logic from it factored out into a helper).
And then we can consider whether to make changes to that policy.
Dropping GIT_CONFIG_* from the environment does make sense in general,
but it doesn't actually happen with the patch above (because only
GIT_CONFIG_COUNT is in the local_repo_env list; to find the others we'd
have to actually enumerate the current environment).
For NO_REPLACE_OBJECTS, I think you could argue that it should not be in
local_repo_env at all. It is more about the operation being performed,
not the repository itself. So for example in this command:
git --no-replace-objects fetch
I would expect that NO_REPLACE_OBJECTS to make it down to any submodule
fetches we do. Likewise for other operation-level variables like
GIT_LITERAL_PATHSPECS, but those are already (correctly IMHO) omitted
from local_repo_env.
It looks like NO_REPLACE_OBJECTS got pulled from the connect.c code in
48a7c1c49d (Refactor list of of repo-local env vars, 2010-02-25). And I
could see somebody wanting to make sure that upload-pack behaved
predictably with respect to replace refs, but it already does: it
disables replace refs itself as part of its startup code.
> This comment details my findings from comparing the list in
> local_repo_env[] and the top-level options listed in
> Documentation/git.adoc. That's how I was able to find that
> --exec-path sets an environment variable that's NOT in the
> list and we want to be sure we don't set it.
>
> Should we add the comparison to EXEC_PATH_ENVIRONMENT as a
> precaution to make sure it's not added to local_repo_env in
> the future? Or is that too defensive?
I don't think we need to bother. Obviously adding it to local_repo_env
would be the wrong thing, but that is true of lots of variables. Trying
to make a list is just going to result in a list that is out-of-date,
because there's nothing pushing people to update it when they introduce
a new variable.
You can imagine a different world, where we had a single list of all
environment variables, and new ones _had_ to be added to the list in
order to function, and each entry had a bitflag for "this is a
local-repo value", then that might force each new addition to consider
whether it should be added. But we don't have such a list, and I think
structuring things that way would introduce new complications and
awkwardness.
So IMHO we should just rely on review to reject a patch that tries
something silly like adding EXEC_PATH_ENVIRONMENT to local_repo_env.
-PeffThere was a problem hiding this comment.
Jeff King wrote on the Git mailing list (how to reply to this email):
On Thu, Feb 26, 2026 at 08:21:23AM -0800, Junio C Hamano wrote:
> > This comment details my findings from comparing the list in
> > local_repo_env[] and the top-level options listed in
> > Documentation/git.adoc. That's how I was able to find that
> > --exec-path sets an environment variable that's NOT in the
> > list and we want to be sure we don't set it.
>
> Hmph, wouldn't we want to use specified exec-path inside ...
>
> git --exec-path=~/my/git/libexec for-each-repo sh -c "do things"
>
> ... "do things" script when we find Git related binaries? Or am I
> not getting what you are describing here?
I almost responded with the same thing, but I think the suggestion is
going the other way: we (correctly) do not list EXEC_PATH_ENVIRONMENT
via local_repo_env, so it will never be removed from the environment.
And thus we do not need to do anything here to drop it from the list of
what is removed. Double negation. :)
The second paragraph:
> > Should we add the comparison to EXEC_PATH_ENVIRONMENT as a
> > precaution to make sure it's not added to local_repo_env in
> > the future? Or is that too defensive?
makes that more clear, I think. I did have to read the whole thing
twice. ;)
-Peff|
This branch is now known as |
|
This patch series was integrated into seen via git@2fe0b01. |
|
There was a status update in the "New Topics" section about the branch "git for-each-repo" started from a secondary worktree did not work as expected, which has been corrected. Expecting a (hopefully small and final) reroll? cf. <20260225131344.GA2139176@coredump.intra.peff.net> source: <pull.2056.v2.git.1771968924.gitgitgadget@gmail.com> |
When run in a worktree, the GIT_DIR directory is set in a different way than in a typical repository. Show this by updating t0068 to include a worktree and add a test that runs from that worktree. This requires moving the repo.key config into a global config instead of the base test repository's local config (demonstrating that it worked with non-worktree Git repositories). We need to be careful to unset the local Git environment variables and let the child process rediscover them, while also reinstating those variables in the parent process afterwards. Update run_command_on_repo() to store, unset, then reset the non-NULL variables. But... ...we also need to be careful to not un-set environment variables around -c config options or --no-replace-objects. A detailed comment includes the justification for these filters. Reported-by: Matthew Gabeler-Lee <fastcat@gmail.com> Signed-off-by: Derrick Stolee <stolee@gmail.com>
4e3f4aa to
df9a480
Compare
|
User |
|
This patch series was integrated into seen via git@973645e. |
|
There was a status update in the "Cooking" section about the branch "git for-each-repo" started from a secondary worktree did not work as expected, which has been corrected. Expecting a (hopefully small and final) reroll? cf. <20260225131344.GA2139176@coredump.intra.peff.net> source: <pull.2056.v2.git.1771968924.gitgitgadget@gmail.com> |
|
This patch series was integrated into seen via git@deea83a. |
|
This patch series was integrated into seen via git@0623443. |
|
This patch series was integrated into seen via git@5f8278b. |
|
There was a status update in the "Cooking" section about the branch "git for-each-repo" started from a secondary worktree did not work as expected, which has been corrected. Expecting a (hopefully small and final) reroll? cf. <20260225131344.GA2139176@coredump.intra.peff.net> source: <pull.2056.v2.git.1771968924.gitgitgadget@gmail.com> |
This was reported by Matthew [1] and is a quick fix.
[1] https://lore.kernel.org/git/CABpCjbY=wpStuhxqRJ5TSNV3A-CmN-g-xZGJOQGSSv3GYhs2fQ@mail.gmail.com/
I also took the liberty of removing the_repository as I wanted to make sure that wasn't involved here.
Thanks,
-Stolee
cc: gitster@pobox.com
cc: fastcat@gmail.com
cc: Eric Sunshine sunshine@sunshineco.com
cc: Jeff King peff@peff.net
cc: Patrick Steinhardt ps@pks.im
cc: Phillip Wood phillip.wood123@gmail.com