Continuing the OSS-Fuzz Integration Conversation #1889

Mar 31, 2024

DaveLak
Mar 31, 2024

Hey @Byron 👋

A problem I see is that reproductions are usually hidden behind a login, so it requires extra work to make these available here so the community has a chance to fix them. However, I am open to give it a shot and see how it goes.

Considering the project's maintenance mode status and the issue backlog's community-driven nature, I can appreciate that. Those are important factors to consider when deciding what APIs are worth fuzzing; there's no sense in burying human user issues in a sea of weird fuzzer-generated edge cases. I'd like to get your perspective on that down the road.

I definitely welcome you to help with maintaining the fuzzer integration, and would appreciate if that would also entail the maintenance of the issues it finds such that all information for reproduction would be contained within them so it's not only maintainers who can fix them.

Sure thing. I'd be happy to help with all things fuzz-related, including OSS-Fuzz issue triage and surfacing the available information to reproduce bugs in the GH issue tracker when appropriate.

That said, in the interest of setting expectations:

I'd definitely need some of your time and input while getting started. I wouldn't be comfortable determining a bug's severity or relevance to the project's threat model without your assistance for at least the first handful, so I'd ask that we review the first few together (email or whatever medium works for you is fine by me).
I'd also like to share a bit of context about myself and my skill set to give you a better idea of who I am and how I ended up writing this lol. I'll add that in a follow-up message.
Besides that, I think the only thing I'd need is your permission to be an auto_cc in the OSS-Fuzz repo's project.yaml for this project.
- I'd expect most of the issues that come up will be less vulnerability-related and more in line with the unhandled exception, but that kicked this off, so I don't know how sensitive that access actually is. I totally understand if you would prefer to defer this bit and figure out another way to share any error logs at this stage. Let me know whatever you prefer.

Finally (for now), I have a few questions for you regarding next steps:

Are you ok with me migrating the two existing fuzzers into this repo? If yes,

I'm happy to get started on that, but the one thing I'm unsure of is how to handle the licensing of those files. As you can see, they are licensed under Apache 2 and have header comments stating as much. I'm happy to raise the question to maintainers in the OSS-Fuzz repo, but if you don't mind the licence header (at least initially) an alternate approach would be to bring them over as is for now and just replace them with new, from scratch tests in a follow-up (my open PR got the existing tests running again but there is definitely improvements to be had so we could just replace them instead.)
Is there a specific directory layout in the repo you'd prefer to place them or does a toplevel fuzzing directory work?

Could you give me a high level (like a few bullet points would be super helpful) picture of what the threat model for this library might look like and APIs that might be of interest related to it?

For example, are there any common use cases you're aware of that involve processing untrusted user input?
Is this a silly question considering this is one of the most popular packages on PyPi with almost 60 million downloads a month? 😅 😅 😅

Byron · Apr 2, 2024

Byron
Apr 2, 2024
Maintainer

Thanks for the summary, it's appreciated!

I'd definitely need some of your time and input while getting started. I wouldn't be comfortable determining a bug's severity or relevance to the project's threat model without your assistance for at least the first handful, so I'd ask that we review the first few together (email or whatever medium works for you is fine by me).

I understand, and communicating in discussions or issues will work for me.

I'd also like to share a bit of context about myself and my skill set to give you a better idea of who I am and how I ended up writing this lol. I'll add that in a follow-up message.

Learning your motivation will be interesting :). My motivation to even take this on is to improve the correctness of the library, knowing that it will never be perfectly correct due to its sometimes hacky nature.

Besides that, I think the only thing I'd need is your permission to be an auto_cc in the OSS-Fuzz repo's project.yaml for this project.

Please feel free to add yourself there and I shall try to approve such PR. Generally, since I won't fix security issues, it's best to have them in the open glaringly so the community steps up to fix it. While at it, I'd appreciate if you could add @EliahKagan to the CC list as well as he is definitely more competent in all of this, particularly in spotting security relevant issues.

Are you ok with me migrating the two existing fuzzers into this repo? If yes,

Yes, please do. Putting it into a top-level fuzzing seems like a good idea, feel free to use any structure you like within it.
Regarding the licensing of the existing files, maybe you can add to the 'licensing' section of the README that parts in fuzzing are under the apache-2 license, which shouldn't be relevant to any release that's made. It's probably worth validating that fuzzing doesn't get picked up by the publishing process though (which I expect it won't as it's include-list based).

Could you give me a high level (like a few bullet points would be super helpful) picture of what the threat model for this library might look like and APIs that might be of interest related to it?

For example, are there any common use cases you're aware of that involve processing untrusted user input?

Is this a silly question considering this is one of the most popular packages on PyPi with almost 60 million downloads a month? 😅 😅 😅

It's 60m per month? Oh my, last time I checked it was 30m and that was like a year ago! Thus, you will probably find the answer silly as there is no threat-model to speak of. Issues are raised, advisories are created, sometimes they are fixed, and that's already it.

Generally, I don't think that GitPython ever has a chance to be correct enough not to be exploitable by merit of being written in Python and relying on command invocation a lot. Further, some code does read local files and it's definitely not en-par with similar code in Git, so could fall pray to all kinds of issues. But that's how it is, and I think the only real solution is to eventually get people to migrate to gitoxide, but that's quite long way out, both to become feature-complete, stable, and available from Python.

So maybe the threat-model is: Don't use GitPython? Use gitoxide, and write everything in Rust 😅.

5 replies

Byron Apr 2, 2024
Maintainer

On another note, the gitoxide fuzzer is currently broken and won't run on their systems, it's still not fixed as I don't interact with the integration itself. Just in case that's easy for you to do and of interest.

DaveLak Apr 2, 2024
Author

I'm definitely happy to take a look but I can't promise I'll be very helpful on that specifically as I've mostly been playing with the C, Python and Go fuzzing so far and from what I've observed the rust builds are in a somewhat precarious state right now for a number of reasons, one of which being related to google/oss-fuzz#11626 and the complexity of an ongoing attempt to upgrade clang from 15 to 18 in google/oss-fuzz#11714

Byron Apr 3, 2024
Maintainer

Thanks for researching the issue and the PR, that's already helpful as I now know that it's not only gitoxide with a problem, but that it's general. Further, I am now subscribed to the issue for a heads-up - great :)!

DaveLak Apr 23, 2024
Author

I'm starting to take a peak at the gitoxide setup and I noticed that the coverage build failure you linked above seems to be resolved. I see there was another similar issue that also resolved but I don't see any more broken builds logged in that issue tracker so I'm curious:

Did you do something intentional that fixed them?
Is the build issue something that you see more often or intermittently than the issues I linked here suggest?

Byron Apr 23, 2024
Maintainer

Thanks for taking a look!

Regarding 1), I didn't do anything related to fuzzers recently, and I don't think this resolution could have been caused by my actions.

Usually, OSS-fuzzing runs were rock-solid, no intermittent build failures. Interesting that the issue on the OSS-fuzz infrastructure isn't closed yet given that it seems to work again. But maybe it doesn't send notifications for this.

EliahKagan · Apr 2, 2024

DaveLak
Apr 2, 2024
Author

Thanks for the summary, it's appreciated!

Right back at you!

communicating in discussions or issues will work for me.

👍

Learning your motivation will be interesting :).

I can't promise "interesting" but if I'm able to provide enough context to dispel the possibility of unintentionally misrepresenting myself through unspoken assumptions, I'd consider that a success lol.

Basically I just wanted to say:
Hey, I'm Dave (or David if you prefer) and I've been working in web development for what I guess must be a bit over 10 years. I got started in the WordPress/PHP world initially before moving into the JS/TS/Node ecosystem where the bulk of my experience has been in e-commerce and product dev roles. So the take away I'm hoping to offer here is: while I certainly consider myself competent with regard to both Python and secure software development, I don't intend to suggest that I specialize in either.

Beyond the obvious last point, that context is also relevant to how I ended up in this repo:

I attribute much of the opportunity I've been afforded personally and professionally to the OSS community that has enabled me along the way, and I've been looking for a way to contribute meaningful value back for a while.
Over the past few years, I've become quite interested in code quality generally and automated testing patterns specifically.

So when I came across OSS-Fuzz about two months ago, I figured working through the list of projects with broken builds would be a great way to make an impact in the OSS world and learn about fuzzing. So here I am 🙂

Please feel free to add yourself there and I shall try to approve such PR. Generally, since I won't fix security issues, it's best to have them in the open glaringly so the community steps up to fix it.

Heard. We can figure out the specifics of how best to cross that bridge if and when we get to it.

While at it, I'd appreciate if you could add @EliahKagan to the CC list as well as he is definitely more competent in all of this, particularly in spotting security relevant issues.

Absolutely.

@EliahKagan, is that ok with you and does the email address on your profile work? (The issue tracker requires a Gmail address/Google account added to the OSS-Fuzz project config to allow authentication.)

Yes, please do. Putting it into a top-level fuzzing seems like a good idea, feel free to use any structure you like within it.

Sounds good. A simple directory with a flat layout should be enough.

Regarding the licensing of the existing files, maybe you can add to the 'licensing' section of the README that parts in fuzzing are under the apache-2 license, which shouldn't be relevant to any release that's made. It's probably worth validating that fuzzing doesn't get picked up by the publishing process though (which I expect it won't as it's include-list based).

👍

you will probably find the answer silly as there is no threat-model to speak of. [...]
Generally, I don't think that GitPython ever has a chance to be correct enough not to be exploitable by merit of being written in Python and relying on command invocation a lot. Further, some code does read local files and it's definitely not en-par with similar code in Git, so could fall pray to all kinds of issues.

As far as identifying interesting fuzz targets is concerned, that sounds like a pretty solid foundation to me!

So maybe the threat-model is: Don't use GitPython? Use gitoxide, and write everything in Rust 😅.

https://xkcd.com/2730/ 😅

1 reply

EliahKagan Apr 3, 2024
Collaborator

@EliahKagan, is that ok with you and does the email address on your profile work? (The issue tracker requires a Gmail address/Google account added to the OSS-Fuzz project config to allow authentication.)

Yes, that is okay with me. Either of the two email addresses given there are fine (and they are both Gmail addresses). Thanks!

Apr 12, 2024

DaveLak
Apr 12, 2024
Author

Initial migration PRs are ready for review:

Initial Migration of Fuzz Tests & Integration Scripts From the OSS-Fuzz Project Repo #1901
[gitpython]: Move Fuzz Tests & Configuration Upstream google/oss-fuzz#11803

0 replies

DaveLak · Apr 23, 2024

DaveLak
Apr 23, 2024
Author

Thanks again for all your support on this, @Byron, it's been quite fun!

Now that we're (hopefully) close to the finish line with the migration I wanted to ask for your thoughts on the optional CIFuzz GitHub action. I see that gitoxide already has it so I'd imagine you're familiar with it to some degree.

I think it could be a helpful addition to the PR checks, but I also recognize that it adds another moving part that may not be desired in this project.

Do you have any interest in getting it set up? If you think it's worthwhile, I'm happy to get a PR up.

3 replies

DaveLak Apr 29, 2024
Author

Incase you didn't see this:

ping @Byron

Byron Apr 29, 2024
Maintainer

Thanks for the ping, indeed I must have managed to archive the notification before acting on it.

Initially I just wanted to say something akin to "Yes, go for it", but then I was wondering if the additional compute will be useful at all. For example, in the previous years, I don't think I ever heard of OSSFuzz finding anything, and bugs it can probably find many.

If you think that the fuzz-tests will be extended to make that more likely, then please do feel free to add a fuzz action that uses the available time. Otherwise, it might be OK to skip it, but it's definitely your call.

DaveLak Apr 30, 2024
Author

Thanks, that makes sense to me.

I agree that until there is more fuzzer coverage, it's probably not the most useful use of resources; especially considering the unit tests for what is covered by the fuzzers do an effective job at regression testing in CI.

I'll skip adding CI for the time being and raise the idea again if/when it has more benefit to offer.

Apr 30, 2024

DaveLak
Apr 30, 2024
Author

I'm going to close this discussion as resolved now that the fuzzers have been migrated and the build issues have been resolved. New discussions can be opened as needed if anything else comes up.

Thanks for all the support on this!

0 replies

Search code, repositories, users, issues, pull requests...

Uh oh!

Continuing the OSS-Fuzz Integration Conversation #1889

Uh oh!

DaveLak Mar 31, 2024

Replies: 5 comments · 9 replies

Uh oh!

Byron Apr 2, 2024 Maintainer

Uh oh!

Byron Apr 2, 2024 Maintainer

Uh oh!

DaveLak Apr 2, 2024 Author

Uh oh!

Byron Apr 3, 2024 Maintainer

Uh oh!

DaveLak Apr 23, 2024 Author

Uh oh!

Byron Apr 23, 2024 Maintainer

Uh oh!

Uh oh!

DaveLak Apr 2, 2024 Author

Uh oh!

EliahKagan Apr 3, 2024 Collaborator

Uh oh!

DaveLak Apr 12, 2024 Author

Uh oh!

DaveLak Apr 23, 2024 Author

Uh oh!

DaveLak Apr 29, 2024 Author

Uh oh!

Byron Apr 29, 2024 Maintainer

Uh oh!

DaveLak Apr 30, 2024 Author

Uh oh!

DaveLak Apr 30, 2024 Author

DaveLak
Mar 31, 2024

Byron
Apr 2, 2024
Maintainer

Byron Apr 2, 2024
Maintainer

DaveLak Apr 2, 2024
Author

Byron Apr 3, 2024
Maintainer

DaveLak Apr 23, 2024
Author

Byron Apr 23, 2024
Maintainer

DaveLak
Apr 2, 2024
Author

EliahKagan Apr 3, 2024
Collaborator

DaveLak
Apr 12, 2024
Author

DaveLak
Apr 23, 2024
Author

DaveLak Apr 29, 2024
Author

Byron Apr 29, 2024
Maintainer

DaveLak Apr 30, 2024
Author

DaveLak
Apr 30, 2024
Author