davide221 committed

Commit b2312cb Β· 1 Parent(s): 89a586f

Update README.md

Files changed (1):
  1. README.md +24 -28

README.md CHANGED

Before:
@@ -1,65 +1,62 @@
---
license: apache-2.0
- datasets:
- - mlabonne/Evol-Instruct-Python-1k
pipeline_tag: text-generation
---
- # πŸ¦™πŸ’» EvolCodeLlama-7b

- πŸ“ [Article](https://medium.com/@mlabonne/a-beginners-guide-to-llm-fine-tuning-4bae7d4da672)

- <center><img src="https://i.imgur.com/5m7OJQU.png" width="300"></center>

This is a [`codellama/CodeLlama-7b-hf`](https://huggingface.co/codellama/CodeLlama-7b-hf) model fine-tuned using QLoRA (4-bit precision) on the [`mlabonne/Evol-Instruct-Python-1k`](https://huggingface.co/datasets/mlabonne/Evol-Instruct-Python-1k) dataset.

## πŸ”§ Training

- It was trained on an RTX 3090 in 1h 11m 44s with the following configuration file:

```yaml
- base_model: codellama/CodeLlama-7b-hf
- base_model_config: codellama/CodeLlama-7b-hf
model_type: LlamaForCausalLM
- tokenizer_type: LlamaTokenizer
is_llama_derived_model: true
- hub_model_id: EvolCodeLlama-7b

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
- - path: mlabonne/Evol-Instruct-Python-1k
type: alpaca
dataset_prepared_path: last_run_prepared
- val_set_size: 0.02
output_dir: ./qlora-out

- adapter: qlora
- lora_model_dir:
-
- sequence_len: 2048
sample_packing: true

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
- lora_target_modules:
lora_target_linear: true
lora_fan_in_fan_out:

- wandb_project: axolotl
wandb_entity:
wandb_watch:
wandb_run_id:
wandb_log_model:

- gradient_accumulation_steps: 1
- micro_batch_size: 10
num_epochs: 3
- optimizer: paged_adamw_32bit
lr_scheduler: cosine
- learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
@@ -75,9 +72,8 @@ logging_steps: 1
xformers_attention:
flash_attention: true

- warmup_steps: 100
- eval_steps: 0.01
- save_strategy: epoch
save_steps:
debug:
deepspeed:
@@ -94,7 +90,7 @@ Here are the loss curves:

![](https://i.imgur.com/zrBq01N.png)

- It is mainly designed for educational purposes, not for inference.

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

@@ -108,7 +104,7 @@ import transformers
import torch

model = "mlabonne/EvolCodeLlama-7b"
- prompt = "Your prompt"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
@@ -124,7 +120,7 @@ sequences = pipeline(
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
-     max_length=200,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")

After:

---
license: apache-2.0
pipeline_tag: text-generation
---
+ # πŸ¦™πŸ’» Safurai-Csharp-34B

+ πŸ“ [Article](https://www.safurai.com/blog/introducing-safurai-csharp)

+ <center><img src="https://media.discordapp.net/attachments/1071900237414801528/1165927645469478942/mrciffa_A_cartoon_samurai_wearing_a_black_jacket_as_a_chemistry_d4c17e16-567a-41da-9e0e-2902e93def2c.png?ex=6548a1bc&is=65362cbc&hm=5721b5c15d8f97374212970a7d01f17923ef5015d385230b8ae5542fd2d0df21&=&width=1224&height=1224" width="300"></center>

This is a [`codellama/CodeLlama-34b-hf`](https://huggingface.co/codellama/CodeLlama-34b-hf) model fine-tuned using QLoRA (4-bit precision) on the `Safurai/EvolInstruct-csharp-16k-13B-Alpaca` dataset.
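
For context, this is roughly how such a 4-bit QLoRA checkpoint is loaded for experimentation with `transformers`, `bitsandbytes`, and `peft`. A minimal sketch, assuming the repo ids from the config below (`codellama/CodeLlama-34b-hf` as base, `Safurai/Evol-csharp-v1` as `hub_model_id`) and assuming that hub repo holds the LoRA adapter rather than merged weights:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "codellama/CodeLlama-34b-hf"  # base model named in the training config
adapter_id = "Safurai/Evol-csharp-v1"   # hub_model_id from the config (assumed to hold the LoRA adapter)

# 4-bit NF4 quantization, matching the QLoRA setup (load_in_4bit: true)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the fine-tuned LoRA weights
```

If the published repo already contains merged weights, the plain `transformers.pipeline` snippet further down is enough on its own.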

## πŸ”§ Training

+ It was trained in 1h 11m 44s with the following configuration file:

```yaml
+ base_model: codellama/CodeLlama-34b-hf
+ base_model_config: codellama/CodeLlama-34b-hf
model_type: LlamaForCausalLM
+ tokenizer_type: CodeLlamaTokenizer
is_llama_derived_model: true
+ hub_model_id: "Safurai/Evol-csharp-v1"

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
+ - path: Safurai/EvolInstruct-csharp-16k-13B-Alpaca
type: alpaca
dataset_prepared_path: last_run_prepared
+ val_set_size: 0.01
output_dir: ./qlora-out

+ sequence_len: 4096
sample_packing: true
+ pad_to_sequence_len: true

+ adapter: lora
+ lora_model_dir:
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

+ wandb_project: codellama-csharp
wandb_entity:
wandb_watch:
wandb_run_id:
wandb_log_model:

+ gradient_accumulation_steps: 4
+ micro_batch_size: 2
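# effective batch size = micro_batch_size * gradient_accumulation_steps = 2 * 4 = 8 sequences per optimizer step (per device)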
num_epochs: 3
+ optimizer: adamw_bnb_8bit
lr_scheduler: cosine
+ learning_rate: 0.0003

train_on_inputs: false
group_by_length: false

xformers_attention:
flash_attention: true

+ warmup_steps: 40
+ eval_steps: 40
save_steps:
debug:
deepspeed:

![](https://i.imgur.com/zrBq01N.png)

+ It is mainly designed for experimental purposes, not for inference.

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

import torch

model = "mlabonne/EvolCodeLlama-7b"
+ prompt = "Your csharp request"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(

    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
+     max_length=1000,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")