Prerequisites
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
Running Functionary 2.4 Small.
The response from the API server endpoint should contain correct values for `completion_tokens` and `total_tokens`.
Current Behavior
`completion_tokens` is always 1. E.g.:

```json
"usage": {
  "prompt_tokens": 507,
  "completion_tokens": 1,
  "total_tokens": 508
}
```
Environment and Context
- macOS 14.4.1, MBP M3 Max
- Darwin MacBook-Pro 23.4.0 Darwin Kernel Version 23.4.0: Fri Mar 15 00:12:37 PDT 2024; root:xnu-10063.101.17~1/RELEASE_ARM64_T6031 arm64
- Python 3.11.5
Failure Information (for bugs)
`completion_tokens` is always 1 when using the API server.
Steps to Reproduce
Run:

```shell
python3 -m llama_cpp.server --model "./functionary/functionary-small-v2.4.Q4_0.gguf" --chat_format functionary-v2 --hf_pretrained_model_name_or_path "./functionary" --n_gpu_layers -1
```
Then send an OpenAI tools-calling request to the endpoint, something like:
```shell
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "What is the weather like in Boston?"
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_current_weather",
          "description": "Get the current weather in a given location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
              },
              "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"]
              }
            },
            "required": ["location"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
```
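The same request can also be reproduced from Python. This is a sketch using only the standard library; it assumes the `llama_cpp.server` instance started above is listening on `http://localhost:8000` (the endpoint URL is an assumption matching the command above, not something fixed by llama-cpp-python):

```python
# Sketch: reproduce the tools-calling request from Python (stdlib only).
# Assumes the llama_cpp.server instance above is listening on localhost:8000.
import json
import urllib.request

# Same request body as the curl example above.
payload = {
    "messages": [
        {"role": "user", "content": "What is the weather like in Boston?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "unit": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                        },
                    },
                    "required": ["location"],
                },
            },
        }
    ],
    "tool_choice": "auto",
}


def send_request(url: str = "http://localhost:8000/v1/chat/completions") -> dict:
    """POST the payload to the chat completions endpoint and return parsed JSON."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Calling `send_request()` against a running server and inspecting the `"usage"` key of the result shows the incorrect `completion_tokens` value.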
The response contains the wrong values for `usage`:

```json
"usage": {
  "completion_tokens": 1,
  "prompt_tokens": 187,
  "total_tokens": 188
}
```
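Note that the reported usage is internally consistent (`prompt_tokens + completion_tokens == total_tokens`), so the bug appears to be in the completion-token counter rather than in the totals arithmetic. A small illustrative helper (not part of llama-cpp-python; the generated-token count is a hypothetical input) makes the inconsistency explicit:

```python
# Illustrative check (not part of llama-cpp-python): validate a usage dict
# from a chat completion response against the number of tokens actually
# generated for the completion.
def check_usage(usage: dict, generated_token_count: int) -> list:
    """Return a list of problems found in the reported usage."""
    problems = []
    # The totals should add up.
    if usage["prompt_tokens"] + usage["completion_tokens"] != usage["total_tokens"]:
        problems.append("total_tokens != prompt_tokens + completion_tokens")
    # completion_tokens should match the number of tokens actually generated.
    if usage["completion_tokens"] != generated_token_count:
        problems.append(
            f"completion_tokens is {usage['completion_tokens']}, "
            f"but {generated_token_count} tokens were generated"
        )
    return problems


# The usage block from the response above: the arithmetic checks out,
# but completion_tokens == 1 cannot be right for a multi-token tool call.
reported = {"prompt_tokens": 187, "completion_tokens": 1, "total_tokens": 188}
```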