---
license: apache-2.0
datasets:
- Open-Orca/OpenOrca
language:
- en
---
|
|
|
# Model Details

* Model Description: This model is a test model for the data ordering task.

* Developed by: Juhwan Lee

* Model Type: Large Language Model
|
|
|
# Model Architecture

This model is based on Mistral-7B-v0.1, which we fine-tuned for the data ordering task.

Mistral-7B-v0.1 is a transformer model with the following architecture choices:

* Grouped-Query Attention

* Sliding-Window Attention

* Byte-fallback BPE tokenizer
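
For reference, the snippet below is a minimal sketch of loading a Mistral-7B-v0.1-based checkpoint for generation with the Hugging Face `transformers` library. The repository id shown is the base model; this card does not state the id of the fine-tuned checkpoint, so substitute it where indicated, and the prompt is only an illustrative placeholder.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Base model id; replace with the repo id of this fine-tuned checkpoint (not stated in this card).
model_id = "mistralai/Mistral-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit a 7B model on a single GPU
    device_map="auto",
)

# Illustrative prompt for a data ordering style task.
prompt = "Order the following steps for brewing tea:\n- Pour water over the leaves\n- Boil water\n- Steep for three minutes"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```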
|
|
|
# Dataset

We randomly sampled 100,000 examples from the Open-Orca dataset and fine-tuned the model on this subset.
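
As an illustration, the following sketch draws a random 100,000-example subset from Open-Orca with the Hugging Face `datasets` library; the split name and seed are assumptions, not values taken from this card.

```python
from datasets import load_dataset

# Load the OpenOrca training split from the Hugging Face Hub.
dataset = load_dataset("Open-Orca/OpenOrca", split="train")

# Shuffle and keep a random subset of 100,000 examples (seed chosen arbitrarily here).
subset = dataset.shuffle(seed=42).select(range(100_000))

print(len(subset))  # 100000
```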
|
|
|
# GitHub

https://github.com/trailerAI
|
|
|
# License

Apache License 2.0