Update llama_cpp_python version to 0.2.75 #1161

mergify[bot] merged 1 commit into instructlab:main from alimaredia:update-llama-cpp-python-ver
Conversation
@alimaredia can we fix the lint error in this as well? Not sure how that got in.
@nathan-weinberg Wouldn't that impact the backport if we just want to backport requirements.txt? If so, the linting should be addressed in a separate PR.
Ran the e2e job here: https://github.com/instructlab/instructlab/actions/runs/9087518967
@nathan-weinberg #1162 fixes the linting issues. Once that is merged I can rebase my PR and the linting test should pass.

@Mergifyio rebase
✅ Branch has been successfully rebased |
Force-pushed from 1f595e5 to 9c85588
@Mergifyio rebase
✅ Branch has been successfully rebased |
Force-pushed from 9c85588 to f114766
The functional test failure looks like a real problem.
Could you try with a larger max ctx size? 4096 might be too small.
Force-pushed from edf937e to f114766
Force-pushed from 09c15a0 to b299786
src/instructlab/chat/chat.py (outdated)

@@ -391,6 +392,10 @@ def start_prompt(self, logger, content=None, box=True):
        )
        self.info["messages"].pop()
This is trimming the newest message, and recent llama-cpp versions don't seem to tolerate changing the message list size.
Ignore the confusion about "newest" vs "latest" -- I read a comment out-of-place.
The main thing here, though, is that we still get InternalError when we try to shorten messages this way; I think we're going to make that a separate issue.
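For context, here is a minimal sketch of a trim-and-retry approach; this is not the actual chat.py code. The helper name, the choice to drop the oldest non-system message (rather than popping the newest as in the diff above), and the assumption that the overflow can surface as either openai.BadRequestError or openai.InternalServerError are all illustrative assumptions:

```python
# Hypothetical sketch, not the instructlab implementation: retry a chat
# completion while trimming messages when the context window is exceeded.
import openai


def chat_with_trimming(client, model, messages, max_attempts=5):
    for _ in range(max_attempts):
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except (openai.BadRequestError, openai.InternalServerError):
            # Assumption: recent llama-cpp-python servers may report a context
            # overflow as a 500 (InternalServerError) rather than a 400.
            if len(messages) <= 1:
                raise  # nothing left to trim
            # Drop the oldest non-system message instead of popping the newest.
            drop_index = 1 if messages[0].get("role") == "system" else 0
            messages.pop(drop_index)
    raise RuntimeError("conversation still does not fit the context window")
```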
Force-pushed from b0beb2b to da93591
This pull request has merge conflicts that must be resolved before it can be merged.
Force-pushed from da93591 to 90feeb3
- Adjust test_ctx_size()
- Handle openai.InternalServerError when chatting

Signed-off-by: Ali Maredia <amaredia@redhat.com>
Force-pushed from 90feeb3 to f24d0d7
@russellb @markstur @tiran @nathan-weinberg What started as just trying to bump the version of llama_cpp_python (because of https://github.com/instructlab/instructlab/security/dependabot/1) turned into realizing that our trimming code in chat doesn't trim the way we'd expect it to anymore. When max-ctx-size goes from 25 to 55, certain prompts throw an openai.InternalServerError. For now I think just handling that error when chatting is the way to go.
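A rough sketch of the "just handle it" approach described above (an assumed shape, not the actual patch): catch openai.InternalServerError around the completion call and surface a readable message instead of a traceback.

```python
# Illustrative only; the function name and message wording are assumptions.
import openai


def send_prompt(client, model, messages):
    try:
        response = client.chat.completions.create(model=model, messages=messages)
    except openai.InternalServerError as exc:
        # With llama_cpp_python 0.2.75 the server can answer with a 500 when the
        # prompt does not fit the context, so report it instead of crashing chat.
        print(f"Model returned an internal server error: {exc}")
        return None
    return response.choices[0].message.content
```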
@markstur could you go into detail about this more or send me a reproducer for what you're talking about? That line should be trimming the last remaining message and then raising a
russellb left a comment
Thank you for your diligent efforts on this!
As discussed, there's more to dig into here to figure out why it's responding with an internal server error. Something unexpected is still happening on the server side. Please file an issue to track down the source of the internal server error at some point.
I think this works reasonably well when the context is 512 (maybe even 256). It's the really short context tests where the internal error is such a problem. I suspect llama-cpp-python needs to fix something with the batch size vs. context size, but I haven't been able to figure out why I only reproduce the problem with small contexts (so far). We probably should merge this soon as the lesser of evils. Wondering about a few things though:
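To poke at the batch-size vs. context-size theory above, a standalone reproduction sketch against llama-cpp-python could look like the following. The model path is a placeholder, and the idea that an n_batch larger than n_ctx is the trigger is only a hypothesis, not a confirmed cause:

```python
# Reproduction sketch (placeholder model path, guessed n_batch/n_ctx interaction);
# runs directly against llama-cpp-python, bypassing the ilab server.
from llama_cpp import Llama

llm = Llama(
    model_path="models/model.gguf",  # placeholder path
    n_ctx=64,     # deliberately tiny context, like the failing short-context tests
    n_batch=512,  # batch size larger than the context window
)

try:
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Say hello in five words."}]
    )
    print(out["choices"][0]["message"]["content"])
except Exception as exc:  # we only want to see what gets raised
    print(f"Failed with {type(exc).__name__}: {exc}")
```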
Can we move forward with this or are we still waiting for all the requested reviews? Thanks!
I’m going to let this merge. If anyone has follow-ups, let’s file an issue to make sure it doesn’t get lost.
@Mergifyio backport release-v0.15
✅ Backports have been created
…-1161 Update llama_cpp_python version to 0.2.75 (backport #1161)
Changes
Which issue is resolved by this Pull Request:
Resolves https://github.com/instructlab/instructlab/security/dependabot/1
Description of your changes:
Testing needs to be done on M3 Macs to ensure abetlen/llama-cpp-python#1286 doesn't still occur.
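As a side note, a trivial sketch for confirming which llama_cpp_python build is active before running the M3 checks (assumes the package is importable in the same environment ilab uses):

```python
# Quick environment check; assumes llama_cpp_python is installed in this venv.
import llama_cpp

print(llama_cpp.__version__)  # expect 0.2.75 after this bump
```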