---
license: cc-by-sa-4.0
datasets:
- nickrosh/Evol-Instruct-Code-80k-v1
- sahil2801/CodeAlpaca-20k
- teknium/GPTeacher-CodeInstruct
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- code
- llama2
---
![image of llama engineer](https://i.imgur.com/JlhW0ri.png)
# Llama-Engineer-Evol-7B
This is a version of Meta's [chat instruction-tuned Llama 2](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) further fine-tuned on over 80,000 coding samples.
The dataset is a combination of [Evol-Instruct-Code-80k-v1](https://huggingface.co/datasets/nickrosh/Evol-Instruct-Code-80k-v1) from [nickrosh](https://huggingface.co/nickrosh), a replication of the Evol-Instruct-Code dataset described in the [WizardCoder](https://arxiv.org/pdf/2306.08568.pdf) paper, and [Teknium](https://huggingface.co/teknium)'s [GPTeacher](https://github.com/teknium1/GPTeacher/blob/main/Codegen/codegen-instruct.json). Special thanks to these folks for putting these datasets together.
## Prompt Format
The recommended model prompt is a variant of the standard Llama 2 format:
```
[INST] <<SYS>>
You are a programming assistant. Always answer as helpfully as possible. Be direct in your response and get to the answer right away. Responses should be short.
<</SYS>>
{your prompt}[/INST]
```
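As a minimal sketch, the template above can be filled in with a small Python helper. The names `build_prompt` and `DEFAULT_SYSTEM_PROMPT` are hypothetical and not part of any library, and the line breaks are assumed to match the template exactly as shown here.

```python
# Default system message, copied from the prompt template above.
DEFAULT_SYSTEM_PROMPT = (
    "You are a programming assistant. Always answer as helpfully as possible. "
    "Be direct in your response and get to the answer right away. "
    "Responses should be short."
)


def build_prompt(user_prompt: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    """Fill the template with a system message and a user prompt.

    Hypothetical helper: it only formats a string and does not call the model.
    """
    return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n{user_prompt}[/INST]"


if __name__ == "__main__":
    # Print the fully formatted prompt for a sample request.
    print(build_prompt("Write a Python function that reverses a string."))
```

The resulting string is what you would pass as the input text to a text-generation call. Swapping in a different `system_prompt` keeps the same wrapper while changing the instructions.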
## Next Steps
- Prune the dataset and possibly fine-tune for longer.
- Run benchmarks.
- Provide GGML and GPTQ.