---
license: apache-2.0
pipeline_tag: text-generation
---
# πŸ¦™πŸ’» Safurai-Csharp-34B

πŸ“ [Article](https://www.safurai.com/blog/introducing-safurai-csharp)

<center><img src="https://media.discordapp.net/attachments/1071900237414801528/1165927645469478942/mrciffa_A_cartoon_samurai_wearing_a_black_jacket_as_a_chemistry_d4c17e16-567a-41da-9e0e-2902e93def2c.png?ex=6548a1bc&is=65362cbc&hm=5721b5c15d8f97374212970a7d01f17923ef5015d385230b8ae5542fd2d0df21&=&width=1224&height=1224" width="300"></center>

This is a [`codellama/CodeLlama-34b-hf`](https://huggingface.co/codellama/CodeLlama-34b-hf) model fine-tuned using QLoRA (4-bit precision) on the [`Safurai/EvolInstruct-csharp-16k-13B-Alpaca`](https://huggingface.co/datasets/Safurai/EvolInstruct-csharp-16k-13B-Alpaca) dataset.
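
With QLoRA, the base model is loaded with its weights quantized to 4-bit and the LoRA adapters are trained on top. Below is a minimal sketch of what that 4-bit loading looks like with `transformers` and `bitsandbytes`; the actual setup is handled by Axolotl, so treat this as illustrative only:

```python
# Illustrative only: 4-bit loading of the base model, as done under the hood for QLoRA
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # matches load_in_4bit: true in the config below
    bnb_4bit_compute_dtype=torch.bfloat16,  # matches bf16: true in the config below
)

base_model = "codellama/CodeLlama-34b-hf"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
```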

## πŸ”§ Training

It was trained in 1h 11m 44s with the following configuration file:

```yaml
base_model: codellama/CodeLlama-34b-hf
base_model_config: codellama/CodeLlama-34b-hf
model_type: LlamaForCausalLM
tokenizer_type: CodeLlamaTokenizer
is_llama_derived_model: true
hub_model_id: "Safurai/Evol-csharp-v1"

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  - path: Safurai/EvolInstruct-csharp-16k-13B-Alpaca
    type: alpaca
dataset_prepared_path: last_run_prepared
val_set_size: 0.01
output_dir: ./qlora-out

sequence_len: 4096
sample_packing: true
pad_to_sequence_len: true

adapter: lora
lora_model_dir:
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project: codellama-csharp
wandb_entity:
wandb_watch:
wandb_run_id:
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 3
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0003

train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 40
eval_steps: 40
save_steps:
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"
```
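
For readers less familiar with Axolotl, the adapter settings above correspond roughly to the following PEFT `LoraConfig`. The `target_modules` list is an assumption of what `lora_target_linear: true` expands to, not taken from the training run itself:

```python
# Rough PEFT equivalent of the adapter settings above (illustrative, not the exact Axolotl internals)
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,               # lora_r
    lora_alpha=16,      # lora_alpha
    lora_dropout=0.05,  # lora_dropout
    # Assumed expansion of lora_target_linear: true (all linear projections in the Llama blocks)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```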

Here are the loss curves:

![](https://i.imgur.com/zrBq01N.png)

This model is intended mainly for experimental purposes, not for production inference.

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

## πŸ’» Usage

```python
# pip install transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "Safurai/Safurai-Csharp-34B"
prompt = "Your C# request"

# Load the tokenizer and a text-generation pipeline for the fine-tuned model
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Generate a single completion for the prompt
sequences = pipeline(
    prompt,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=1000,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
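
Since the dataset uses the Alpaca format (`type: alpaca` in the configuration above), wrapping the request in an Alpaca-style prompt may give better results. A sketch reusing the `pipeline` and `tokenizer` from the snippet above, assuming the standard Alpaca template (the exact template applied during training may differ):

```python
# Sketch: Alpaca-style prompt wrapper; the exact template used during training may differ
def alpaca_prompt(instruction: str) -> str:
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

sequences = pipeline(
    alpaca_prompt("Write a C# method that reverses a string."),
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=1000,
)
print(sequences[0]["generated_text"])
```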