Refactor Functionary chat_handler to use HF AutoTokenizer #1075

Closed · opened by @jeffrey-fong

Is your feature request related to a problem? Please describe.
Currently, llama-cpp-python's tokenizer differs from HF's AutoTokenizer in two ways:

  1. `model.tokenize` (llama-cpp-python) != `tokenizer.encode` (HF)
    llama-cpp-python's tokenizer consistently produces a different token ID than HF's AutoTokenizer for the token immediately following every special token.
  2. `model.detokenize` returns an empty string when converting a special token back to text.

I am currently working on implementing the chat handlers for Functionary-7b-v1.4 and all v2 models, as mentioned in this issue. However, all of these models use added special tokens in the prompt template and as stop tokens, which leads to suboptimal generation (problem 1) and the inability to stop generation (problem 2). Both problems are reproduced in the sketch below.
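A minimal sketch of both problems, assuming a local Functionary GGUF file, the `meetkai/functionary-7b-v1.4` HF repo id, and a llama-cpp-python version whose `tokenize` accepts a `special` flag; the prompt string and the `<|stop|>` token follow the v2 template but are illustrative, not exact:

```python
from llama_cpp import Llama
from transformers import AutoTokenizer

llama = Llama(model_path="./functionary-7b-v1.4.Q4_0.gguf")  # hypothetical path
hf_tok = AutoTokenizer.from_pretrained("meetkai/functionary-7b-v1.4")  # assumed repo id

prompt = "<|from|>user\n<|content|>hi"  # uses the model's added special tokens

# Problem 1: the token ID right after each special token can differ between the two.
llama_ids = llama.tokenize(prompt.encode("utf-8"), add_bos=False, special=True)
hf_ids = hf_tok.encode(prompt, add_special_tokens=False)
print(llama_ids == hf_ids)  # expected True, observed False

# Problem 2: detokenizing a special token yields an empty byte string, so a
# special stop token can never be matched against the generated text.
stop_id = hf_tok.convert_tokens_to_ids("<|stop|>")  # assumed v2 stop token
print(llama.detokenize([stop_id]))  # b"" instead of b"<|stop|>"
```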

Describe the solution you'd like
Refactor the current Functionary chat_handler to use HF's AutoTokenizer instead of llama-cpp-python's tokenizer. We could then use the jinja chat template shipped with the various Functionary models directly, which would also eliminate any tokenization discrepancies.
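A sketch of the proposed direction, assuming the Functionary repos bundle a jinja chat template in their tokenizer config; the repo id is again an assumption:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meetkai/functionary-7b-v1.4")  # assumed repo id

messages = [{"role": "user", "content": "What's the weather in Istanbul?"}]

# apply_chat_template renders the model's own jinja template and tokenizes
# with HF's tokenizer, so the chat_handler needs no hand-written prompt
# builder and inherits correct handling of the added special tokens.
prompt_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# These IDs could then be fed to llama-cpp-python's low-level evaluation and
# sampling path, bypassing its tokenizer for prompt construction entirely.
print(prompt_ids[:16])
```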

Describe alternatives you've considered
Fixing the tokenizer issue in llama.cpp directly, but I lack knowledge of the details of that codebase.

Additional context
NA


Labels: enhancement (New feature or request)
