How can you convert the pth into the gguf model?

#1
by mzwing - opened

Hi, thanks for your awesome work!

I want to create a more comprehensive set of quantization variants for the original model, but I couldn't find a way to deal with the pth file format. What's worse, the convert_rwkv_checkpoint_to_hf.py script provided by transformers also complains:

Traceback (most recent call last):
  File "/home/mzwing/AI/runner/tools/convert_rwkv_checkpoint_to_hf.py", line 201, in <module>
    convert_rmkv_checkpoint_to_hf_format(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        args.repo_id,
        ^^^^^^^^^^^^^
    ...<5 lines>...
        model_name=args.model_name,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/mzwing/AI/runner/tools/convert_rwkv_checkpoint_to_hf.py", line 151, in convert_rmkv_checkpoint_to_hf_format
    torch.save({k: v.cpu().clone() for k, v in state_dict.items()}, os.path.join(output_dir, shard_file))
                                               ^^^^^^^^^^^^^^^^
AttributeError: 'Tensor' object has no attribute 'items'. Did you mean: 'item'?

If I ignore the error and continue converting to GGUF, llama.cpp's convert_hf_to_gguf.py throws this:

Traceback (most recent call last):
  File "/home/mzwing/AI/repos/llama.cpp/./convert_hf_to_gguf.py", line 5140, in <module>
    main()
    ~~~~^^
  File "/home/mzwing/AI/repos/llama.cpp/./convert_hf_to_gguf.py", line 5112, in main
    model_architecture = hparams["architectures"][0]
                         ~~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 'architectures'

So, how do you convert the pth file into a GGUF model? Could you please help me? Thanks a lot!

Owner

Sorry for the late reply 😥. You need to use the pth_to_hf.py script to convert the pth file to HF format, and then convert the HF model to GGUF. Below is the content of pth_to_hf.py ❤

# Convert the RWKV pth checkpoint into an HF-style pytorch_model.bin
import torch
 
SOURCE_MODEL="./v6-FinchX-14B-pth/rwkv-14b-final.pth"
TARGET_MODEL="./v6-Finch-14B-HF/pytorch_model.bin"
 
# delete target model
import os
if os.path.exists(TARGET_MODEL):
    os.remove(TARGET_MODEL)
 
model = torch.load(SOURCE_MODEL, mmap=True, map_location='cpu')
# Rename all the keys, to include "rwkv."
new_model = {}
for key in model.keys():
 
    # If the key starts with "blocks."
    if key.startswith("blocks."):
        new_key = "rwkv." + key
        # Replace .att. with .attention.
        new_key = new_key.replace(".att.", ".attention.")
        # Replace .ffn. with .feed_forward.
        new_key = new_key.replace(".ffn.", ".feed_forward.")
        # Replace `0.ln0.` with `0.pre_ln.`
        new_key = new_key.replace("0.ln0.", "0.pre_ln.")
    else:
        # No rename needed
        new_key = key
 
        # Rename `emb.weight` to `rwkv.embeddings.weight`
        if key == "emb.weight":
            new_key = "rwkv.embeddings.weight"
 
        # Rename `ln_out.x` to `rwkv.ln_out.x`
        if key.startswith("ln_out."):
            new_key = "rwkv." + key
 
    print("Renaming key:", key, "--to-->", new_key)
    new_model[new_key] = model[key]
 
# Save the new model
print("Saving the new model to:", TARGET_MODEL)
torch.save(new_model, TARGET_MODEL)
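
As an optional sanity check (not part of the original script), you can reload the converted file and confirm the renamed keys are there:

import torch

# Reload the converted checkpoint and spot-check the renamed keys
state = torch.load("./v6-Finch-14B-HF/pytorch_model.bin", map_location="cpu")
print(len(state), "tensors")
print(sorted(k for k in state if k.startswith("rwkv.blocks.0."))[:5])
assert "rwkv.embeddings.weight" in state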

Thanks for your reply!

However, if I use this script to convert the pth file, llama.cpp's convert_hf_to_gguf.py complains:

INFO:hf-to-gguf:Loading model: RWKV6-3B-Chn-UnlimitedRP-mini-chat-HF
Traceback (most recent call last):
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 5140, in <module>
    main()
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 5108, in main
    hparams = Model.load_hparams(dir_model)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 468, in load_hparams
    with open(dir_model / "config.json", "r", encoding="utf-8") as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '../RWKV6-3B-Chn-UnlimitedRP-mini-chat-HF/config.json'

BTW, does this script come from https://rwkv.cn/llamacpp#appendix-code? I believe it does convert the weights into HF format, but it doesn't write the model info to config.json and the other metadata files.

Owner

The required config.json and other files are at this URL: https://huggingface.co/RWKV/rwkv-6-world-3b
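
For example, something like this should pull the metadata files into your HF folder without re-downloading the weights (just a sketch using huggingface_hub; the folder name is a placeholder, replace it with your own output directory):

from huggingface_hub import snapshot_download

# Grab config/tokenizer files only; the *.bin / *.safetensors weights are skipped
snapshot_download(
    repo_id="RWKV/rwkv-6-world-3b",
    allow_patterns=["*.json", "*.py", "*.txt"],
    local_dir="./your-hf-model-folder",
)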

Yes! This Python script comes from https://rwkv.cn/llamacpp#appendix-code

Oh, thanks a lot! I will give it a try later. ❤️

This time when I ran convert_hf_to_gguf.py I encountered a new traceback 😢:

> python ./convert_hf_to_gguf.py --outtype f16 --outfile ../RWKV6-3B-Chn-UnlimitedRP-mini-chat-GGUF.F16.gguf ../RWKV6-3B-Chn-UnlimitedRP-mini-chat-HF/
INFO:hf-to-gguf:Loading model: RWKV6-3B-Chn-UnlimitedRP-mini-chat-HF
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model part 'pytorch_model.bin'
INFO:hf-to-gguf:token_embd.weight,                    torch.bfloat16 --> F16, shape = {2560, 65536}
............
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model tokenizer
Traceback (most recent call last):
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 5140, in <module>
    main()
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 5134, in main
    model_instance.write()
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 440, in write
    self.prepare_metadata(vocab_only=False)
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 433, in prepare_metadata
    self.set_vocab()
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 3330, in set_vocab
    assert (self.dir_model / "rwkv_vocab_v20230424.txt").is_file()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

What can I do next?

OK, I think I found rwkv_vocab_v20230424.txt; it's actually here: https://huggingface.co/RWKV/v6-Finch-1B6-HF/blob/main/rwkv_vocab_v20230424.txt
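
If that is the right file, one way to pull it straight into the HF folder (assuming huggingface_hub is installed; the local_dir is the same folder I passed to convert_hf_to_gguf.py):

from huggingface_hub import hf_hub_download

# Download the vocab file next to pytorch_model.bin and config.json
hf_hub_download(
    repo_id="RWKV/v6-Finch-1B6-HF",
    filename="rwkv_vocab_v20230424.txt",
    local_dir="../RWKV6-3B-Chn-UnlimitedRP-mini-chat-HF",
)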

However, I still cannot convert it successfully. It now fails with a new traceback:

Traceback (most recent call last):
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 5140, in <module>
    main()
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 5134, in main
    model_instance.write()
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 440, in write
    self.prepare_metadata(vocab_only=False)
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 433, in prepare_metadata
    self.set_vocab()
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 3340, in set_vocab
    assert len(parts) >= 3
           ^^^^^^^^^^^^^^^
AssertionError
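
For anyone hitting the same assertion: my reading of set_vocab (an assumption on my part, I may be wrong) is that the converter splits each line of rwkv_vocab_v20230424.txt on spaces and expects at least three fields, roughly "<id> <token-repr> <byte-length>". A quick check of the downloaded file:

# Rough sanity check of the vocab file layout the converter seems to expect
with open("rwkv_vocab_v20230424.txt", "r", encoding="utf-8") as f:
    for lineno, line in enumerate(f, 1):
        parts = line.split(' ')
        if len(parts) < 3:
            print(f"line {lineno} has only {len(parts)} field(s): {line!r}")
            break
    else:
        print("every line has at least 3 space-separated fields")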

I found that the GitHub repo https://github.com/BBuf/RWKV-World-HF-Tokenizer can help with the conversion!

However, it still requires some manual editing (see its README.md for more details).

I think I will fork it to make some improvements.

BTW, if we want to use the repo, we should also pin transformers==4.46.3, or the script will just refuse to work 😢...

And the file I mentioned above still needs to be put in the HF folder.

(It looks a bit complicated...)

Owner

> OK, I think I found rwkv_vocab_v20230424.txt; it's actually here: https://huggingface.co/RWKV/v6-Finch-1B6-HF/blob/main/rwkv_vocab_v20230424.txt
>
> However, I still cannot convert it successfully. It now fails with the `assert len(parts) >= 3` AssertionError quoted above.

You can try using the files in this repo 🤔: https://huggingface.co/RWKV/v6-Finch-3B-HF

Owner

> I found that the GitHub repo https://github.com/BBuf/RWKV-World-HF-Tokenizer can help with the conversion! However, it still requires some manual editing (see its README.md for more details). […]

Since I don't have LLM-related expertise, I don't quite understand how it works, sorry.

You can try using the files in this repo 🤔: https://huggingface.co/RWKV/v6-Finch-3B-HF

It worked!

Looks like both conversion approaches give the same result!

Thanks for your patient reply!!!!!

My result: https://huggingface.co/mzwing/RWKV6-3B-Chn-UnlimitedRP-mini-chat-GGUF/tree/main

mzwing changed discussion status to closed
Owner

> It worked! Looks like both conversion approaches give the same result! Thanks for your patient reply!!!!! My result: https://huggingface.co/mzwing/RWKV6-3B-Chn-UnlimitedRP-mini-chat-GGUF/tree/main

😊 I'm glad I could solve your problem.
