Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Use a shared buffer in calls of git_treebuilder_write to avoid heap contention#3892

Merged
ethomson merged 1 commit intolibgit2:masterlibgit2/libgit2:masterfrom
mitesch:shared_buffermitesch/libgit2:shared_bufferCopy head branch name to clipboard
Jan 21, 2017
Merged

Use a shared buffer in calls of git_treebuilder_write to avoid heap contention#3892
ethomson merged 1 commit intolibgit2:masterlibgit2/libgit2:masterfrom
mitesch:shared_buffermitesch/libgit2:shared_bufferCopy head branch name to clipboard

Conversation

@mitesch
Copy link
Contributor

@mitesch mitesch commented Aug 8, 2016

When doing many merges in parallel in a large repository, we saw performance degrade rapidly due to memory allocations in git_treebuilder_write. This change uses a shared buffer created in git_tree__write_index to avoid the small allocation per tree in git_treebuilder_write.

In our tests against that repository we saw improvements of ~50% in merges under heavy load.

@ethomson
Copy link
Member

ethomson commented Aug 8, 2016

Cool, that sounds like a great improvement!

First, please revert any changes that are only whitespace - it seems like a bunch of those snuck in. This will help avoid unnecessary churn.

Second, I think that a more straightforward change would be to give git_treebuilder its own git_buf that it reuses - that will ensure that all users of the treebuilder get this improvement, not just git_tree__write_index. Does that seem reasonable?

src/tree.c Outdated

/* Grow the buffer beforehand to an estimated size */
error = git_buf_grow(&buffer, entrycount * 72);
if (error >= 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually avoid braces around single-statement branches

@pks-t
Copy link
Member

pks-t commented Aug 9, 2016

Small nit on the commit message itself: please try to keep the initial line short, as otherwise many git commands become hard to read (e.g. git log --oneline). I'd propose something along the lines of

treebuilder: use shared buffer for writing trees

The function to write trees allocates a new buffer for each tree.
This causes problems with performance when performing a lot
of actions involving writing trees, e.g. when doing many merges.
Fix the issue by instead handing in a shared buffer, which is then
re-used across the calls without having to re-allocate between
calls.

And I agree with @ethomson on that it would be nicer to have a buffer at the treebuilder itself which would be reused for other functions, as well.

Anyway, this heads in a good direction. Thanks for contributing :)

@mitesch
Copy link
Contributor Author

mitesch commented Aug 9, 2016

Placing the buffer on the builder is a bit tricky, I think. write_tree creates a new treebuilder each call and only calls git_treebuilder_write once with it, so I don't think just that change would reduce the allocations. The builder can't be shared across write_tree calls because of the recursion in the loop.

I could put the buffer on the treebuilder, but I'd have to manage the buffer the same way I do now and just set bld->buf in write_tree. That doesn't seem better though.

@ethomson
Copy link
Member

ethomson commented Aug 9, 2016

Aha - thanks for clarifying, @mitesch - obviously you're quite right and I didn't understand the lifecycle there when I was re-reading it.

In that case, yes, I think this is great. @pks-t 's correct about our style - we follow the kernel / git styles. If you wouldn't mind squashing a fixup commit that matches the style, that would be great!

src/tree.c Outdated
if ((ret = git_buf_grow(&shared_buf, 1024)) >= 0)
{
ret = write_tree(oid, repo, index, "", 0, &shared_buf);
git_buf_free(&shared_buf);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The free doesn't need to be conditional, it'll do the right thing in case of OOM.

src/tree.c Outdated
entrycount = git_strmap_num_entries(bld->map);

/* Grow the buffer beforehand to an estimated size */
error = git_buf_grow(&buffer, entrycount * 72);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the opposite of how we do error handling in the library. We use early returns or jumps to cleanup if things go wrong instead of putting the success path inside ifs.

@carlosmn
Copy link
Member

carlosmn commented Aug 9, 2016

Mostly some style/nomenclature comments, but overall it's looking good. It might be worth making this part of the public API so external users of this function can make use of it as well.

The tree update might also benefit from doing something like this, which is another semi-heavy user of this.

@mitesch
Copy link
Contributor Author

mitesch commented Aug 9, 2016

Pushed a fixup with most of the requested changes. I'll squash once everything is 100%
Couple things still open:

  • Where the initial buffer growth should be.
  • Should I make this part of the public API?

@mitesch mitesch force-pushed the shared_buffer branch 3 times, most recently from 10c752f to d368e87 Compare September 28, 2016 14:26
@mitesch
Copy link
Contributor Author

mitesch commented Sep 29, 2016

@ethomson @carlosmn squashed and should be good to go

src/tree.h Outdated
void git_tree__free(void *tree);
int git_tree__parse(void *tree, git_odb_object *obj);

int git_treebuilder_write_with_buffer(git_oid *oid, git_treebuilder *bld, git_buf *tree);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry - one more minor nit. For functions that are internal to the library, we use a double-underscore. For example, this would be git_treebuilder__write_with_buffer.

But there's no need to put this in a header at all; what I would prefer to see is to revert the changes to tree.h and instead declare write_with_buffer as static in tree.c. Would you be so kind? Thanks again!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other direction to go to would be to make it part of the public API. This declaration would then be moved to include/git2/tree.h and would need some documentation. I'd be in favour of making it public as it can be useful for external users of the tree builder as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 to just making this public. That would be great.

The function to write trees allocates a new buffer for each tree.
This causes problems with performance when performing a lot
of actions involving writing trees, e.g. when doing many merges.
Fix the issue by instead handing in a shared buffer, which is then
re-used across the calls without having to re-allocate between
calls.
@mitesch
Copy link
Contributor Author

mitesch commented Dec 12, 2016

@ethomson @carlosmn Updated, sorry for the delay. I moved the definition to include/git2/tree.h. Let me know if the documentation is sufficient.

@ethomson ethomson merged commit 44e8af8 into libgit2:master Jan 21, 2017
tommoor added a commit to goabstract/libgit2 that referenced this pull request Mar 24, 2017
This reverts commit 44e8af8, reversing
changes made to 3b4eb10.
tommoor added a commit to goabstract/libgit2 that referenced this pull request Mar 25, 2017
This reverts commit 44e8af8, reversing
changes made to 3b4eb10.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

Morty Proxy This is a proxified and sanitized view of the page, visit original site.