[Bug]: Incomplete words when using llama.cpp / llama-server as backend #10530

@MikeNatC

Description


Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

N.A.

RAGFlow image version

v0.20.5 full

Other environment information

Running on an Unraid server (Linux-based) with two GPUs (an RTX 3090 and an RTX 3090 Ti), using the Docker images.

Actual behavior

I connected my llama-server instance (proxied through llama-swap) running GPT-OSS-20B to RAGFlow using the OpenAI-compatible API connection. When chatting with the model, the thinking tokens contain incomplete words, although the final output seems fine.

See this screenshot:

[Screenshot: thinking tokens in RAGFlow containing fragmented/incomplete words when served by llama-server]

However, when I use the same model on Ollama, the output is coherent:

[Screenshot: coherent thinking tokens from the same model served by Ollama]

I am not sure whether something is wrong with my settings, but when I use both Ollama and llama-server as backends in my Open WebUI chat interface, there are no problems and the outputs are consistent between the two.
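To narrow down whether the fragmentation originates in llama-server's stream or in RAGFlow's reassembly of the streamed deltas, here is a minimal sketch (not RAGFlow code) that prints every delta verbatim. The base URL, model name, and the `reasoning_content` delta field are assumptions based on my setup; that field is non-standard and only present when llama-server is configured to separate reasoning output (e.g., via its reasoning-format option).

```python
# Minimal sketch: print each streamed delta from llama-server's
# OpenAI-compatible endpoint exactly as received, so word/token
# boundaries are visible. Assumes llama-server (behind llama-swap)
# listens on localhost:8080; the model alias is as registered there.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

stream = client.chat.completions.create(
    model="gpt-oss-20b",  # assumed model alias
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,
)

for chunk in stream:
    if not chunk.choices:  # e.g. a trailing usage-only chunk
        continue
    delta = chunk.choices[0].delta
    # `reasoning_content` is a non-standard delta field some servers
    # emit for thinking tokens; treating its presence as an assumption.
    reasoning = getattr(delta, "reasoning_content", None)
    if reasoning:
        print(f"reasoning delta: {reasoning!r}")
    if delta.content:
        print(f"content delta:   {delta.content!r}")
```

If the printed deltas read as intact text once concatenated, the fragments are being introduced on the RAGFlow side when the reasoning deltas are joined for display.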

Expected behavior

I expect the output to match that of Ollama running the same model, GPT-OSS-20B.

Steps to reproduce

1. Add GPT-OSS-20B as an OpenAI-Compatible API model.
2. Use GPT-OSS-20B (served via the OpenAI-Compatible API option) as the chat model.
3. Chat and observe the quality of the reasoning tokens (a raw-stream check is sketched after this list).
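To reproduce at the wire level, here is a sketch that dumps the raw SSE stream exactly as RAGFlow would receive it, with no client library in between. The endpoint and model name are the same assumptions as above. Note that deltas are token-sized and routinely split words mid-way; that is normal for streaming, and the client is expected to concatenate them verbatim without inserting separators.

```python
# Sketch: dump raw server-sent events from the OpenAI-compatible
# endpoint (assumed: llama-server behind llama-swap on :8080).
import json
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "gpt-oss-20b",  # assumed model alias
        "messages": [{"role": "user", "content": "Explain RAG briefly."}],
        "stream": True,
    },
    stream=True,
)

for line in resp.iter_lines():
    if not line or not line.startswith(b"data: "):
        continue
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":
        break
    for choice in json.loads(payload).get("choices", []):
        # Each delta is typically a token fragment; a correct client
        # joins them as-is, which is what Ollama/Open WebUI appear to do.
        print(choice.get("delta", {}))
```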

Additional information

No response


Metadata

Assignees: no one assigned
Labels: 🐞 bug (Something isn't working)
Type: none
Projects: none
Milestone: none
Relationships: none yet
Development: no branches or pull requests