|
--- |
|
license: cc-by-nc-sa-4.0 |
|
pipeline_tag: text-generation |
|
language: |
|
- en |
|
tags: |
|
- finetuned |
|
--- |
|
|
|
# Model Card for ZoyLLM-7B-SlimOrca |
|
|
|
The ZoyLLM-7B-SlimOrca Large Language Model (LLM) is a generative text model LoRA-finetuned from the Mistral-7B-v0.1 base model.
|
According to Mistral AI, Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks they tested.
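
If the repository ships LoRA adapters rather than merged weights, a minimal loading sketch with `transformers` and `peft` could look like the following (the adapter repo id below is a placeholder, not a confirmed path):

```

# Minimal loading sketch; assumes the repo ships LoRA adapters.

# The adapter repo id is a placeholder -- replace it with the actual Hub path.

import torch

from transformers import AutoModelForCausalLM, AutoTokenizer

from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"

adapter_id = "ZoyLLM-7B-SlimOrca"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(base_id)

base = AutoModelForCausalLM.from_pretrained(

    base_id, torch_dtype=torch.bfloat16, device_map="auto"

)

model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA adapters

```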
|
|
|
## Model Architecture |
|
|
|
ZoyLLM-7B-SlimOrca is a transformer model, with the following architecture choices: |
|
- Grouped-Query Attention |
|
- Sliding-Window Attention |
|
- Byte-fallback BPE tokenizer |
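
All three choices are inherited from the Mistral-7B-v0.1 base model. As a quick check, the grouped-query and sliding-window settings can be read off the base model's config (attribute names follow the `transformers` `MistralConfig`):

```

from transformers import AutoConfig

# Inspect the architecture settings of the Mistral-7B-v0.1 base

config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")

print(config.num_attention_heads)  # 32 query heads

print(config.num_key_value_heads)  # 8 key/value heads -> grouped-query attention

print(config.sliding_window)       # 4096 -> sliding-window attention span

```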
|
|
|
## Datasets |
|
- Self-introduction (20 samples) |
|
- SlimOrca (100k samples, randomly sampled)
|
- EverythingLM v3 |
|
|
|
## Template |
|
We finetuned the model using a template similar to the Dolphin chat template (ChatML format):
|
``` |
|
<|im_start|>system |
|
{system}<|im_end|> |
|
<|im_start|>user |
|
{prompt}<|im_end|> |
|
<|im_start|>assistant |
|
``` |
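
As a sketch, the template can be filled in manually before generation (this assumes `tokenizer` and `model` are loaded as in the example above; the system message is illustrative):

```

# Build a ChatML-style prompt following the template above

system = "You are ZoyLLM, a helpful assistant."

prompt = "Who are you?"

text = (

    f"<|im_start|>system\n{system}<|im_end|>\n"

    f"<|im_start|>user\n{prompt}<|im_end|>\n"

    "<|im_start|>assistant\n"

)

inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens

print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

```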
|
|
|
## Troubleshooting |
|
|
|
- If you see the following error: |
|
``` |
|
KeyError: 'mistral' |
|
``` |
|
- Or: |
|
``` |
|
NotImplementedError: Cannot copy out of meta tensor; no data! |
|
``` |
|
|
|
Ensure you are using a stable version of Transformers, 4.34.0 or newer.
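
For example, upgrading in place should resolve both errors:

```

pip install --upgrade "transformers>=4.34.0"

```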
|
|
|
## The Zoy AI Team |
|
|
|
Pham Tung Lam, Nguyen Duc Nhan. |