Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Bug: convert-hf-to-gguf.py on Gemma model ValueError: Duplicated key name 'tokenizer.chat_template' #7923

Copy link
Copy link
Closed
@bartowski1182

Description

@bartowski1182
Issue body actions

What happened?

When trying to convert

https://huggingface.co/SakanaAI/DiscoPOP-zephyr-7b-gemma/

I get the error in the title, but it's only defined a single time in tokenizer_config.json:

https://huggingface.co/SakanaAI/DiscoPOP-zephyr-7b-gemma/blob/main/tokenizer_config.json#L59

Verified locally with cat *.json | grep chat_template and I only get the one result

Is it somehow trying to load it twice?

Looks like when Gemma is initialized, it runs _set_vocab_sentencepiece(), which runs special_vocab.add_to_gguf (which pulls in the chat_template), and then it also again runs special_vocab.add_to_gguf

but that would mean it's been broken since April 16..

#6689

Name and Version

b3145 ubuntu 22.04

What operating system are you seeing the problem on?

Linux

Relevant log output

INFO:hf-to-gguf:Loading model: DiscoPOP-zephyr-7b-gemma
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.vocab:Setting special token type bos to 2
INFO:gguf.vocab:Setting special token type eos to 1
INFO:gguf.vocab:Setting special token type unk to 3
INFO:gguf.vocab:Setting special token type pad to 0
INFO:gguf.vocab:Setting add_bos_token to False
INFO:gguf.vocab:Setting add_eos_token to False
INFO:gguf.vocab:Setting chat_template to {% if messages[0]['role'] == 'user' or messages[0]['role'] == 'system' %}{{ bos_token }}{% endif %}{% for message in messages %}{{ '<|im_start|>' + message['role'] + '
' + message['content'] + '<|im_end|>' + '
' }}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
' }}{% elif messages[-1]['role'] == 'assistant' %}{{ eos_token }}{% endif %}
INFO:gguf.vocab:Setting special token type prefix to 67
INFO:gguf.vocab:Setting special token type suffix to 69
INFO:gguf.vocab:Setting special token type middle to 68
WARNING:gguf.vocab:No handler for special token type fsep with id 70 - skipping
INFO:gguf.vocab:Setting special token type eot to 107
INFO:gguf.vocab:Setting chat_template to {% if messages[0]['role'] == 'user' or messages[0]['role'] == 'system' %}{{ bos_token }}{% endif %}{% for message in messages %}{{ '<|im_start|>' + message['role'] + '
' + message['content'] + '<|im_end|>' + '
' }}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
' }}{% elif messages[-1]['role'] == 'assistant' %}{{ eos_token }}{% endif %}
Traceback (most recent call last):
  File "/llama.cpp/convert-hf-to-gguf.py", line 2882, in <module>
    main()
  File "/llama.cpp/convert-hf-to-gguf.py", line 2867, in main
    model_instance.set_vocab()
  File "/llama.cpp/convert-hf-to-gguf.py", line 2251, in set_vocab
    special_vocab.add_to_gguf(self.gguf_writer)
  File "/llama.cpp/gguf-py/gguf/vocab.py", line 73, in add_to_gguf
    gw.add_chat_template(self.chat_template)
  File "/llama.cpp/gguf-py/gguf/gguf_writer.py", line 565, in add_chat_template
    self.add_string(Keys.Tokenizer.CHAT_TEMPLATE, value)
  File "/llama.cpp/gguf-py/gguf/gguf_writer.py", line 206, in add_string
    self.add_key_value(key, val, GGUFValueType.STRING)
  File "/llama.cpp/gguf-py/gguf/gguf_writer.py", line 166, in add_key_value
    raise ValueError(f'Duplicated key name {key!r}')
ValueError: Duplicated key name 'tokenizer.chat_template'
stepancheg

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingSomething isn't workinglow severityUsed to report low severity bugs in llama.cpp (e.g. cosmetic issues, non critical UI glitches)Used to report low severity bugs in llama.cpp (e.g. cosmetic issues, non critical UI glitches)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      Morty Proxy This is a proxified and sanitized view of the page, visit original site.