ELhadratiOth committed on
Commit 0bcc47a · verified · 1 Parent(s): 5990487

Update README.md

Files changed (1):
  1. README.md +83 -49
README.md CHANGED
@@ -1,49 +1,83 @@
- ---
- library_name: peft
- license: apache-2.0
- base_model: Qwen/Qwen2.5-1.5B-Instruct
- tags:
- - llama-factory
- - lora
- - generated_from_trainer
- model-index:
- - name: models
-   results: []
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # models
-
- This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) on the darija_finetune_train dataset.
-
-
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0001
- - train_batch_size: 1
- - eval_batch_size: 1
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 2
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 8
- - total_eval_batch_size: 2
- - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_ratio: 0.1
- - num_epochs: 10.0
-
-
- ### Framework versions
-
- - PEFT 0.12.0
- - Transformers 4.49.0
- - Pytorch 2.5.1+cu121
- - Datasets 3.2.0
- - Tokenizers 0.21.0
+ # Darija-English Translator
+
+ This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) on the `darija_finetune_train` dataset. It is designed to translate text from Moroccan Darija (a dialect of Arabic) to English.
+
+ ## Model Details
+
+ - **Library**: PEFT
+ - **License**: Apache 2.0
+ - **Base Model**: Qwen/Qwen2.5-1.5B-Instruct
+ - **Tags**: `llama-factory`, `lora`, `generated_from_trainer`
+
+ ## How to Use
+
+ You can load and use the model with the `transformers` library; the `peft` package must also be installed so the LoRA adapter can be attached:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ # Define model and tokenizer
+ base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
+ finetuned_model_id = "ELhadratiOth/darija-english-translater"
+ device = "cuda" if torch.cuda.is_available() else "cpu"  # falls back to CPU when no GPU is available
+
+ # Load the base model (device_map="auto" requires the `accelerate` package)
+ model = AutoModelForCausalLM.from_pretrained(
+     base_model_id,
+     device_map="auto",
+     torch_dtype="auto"
+ )
+
+ # Load the fine-tuned LoRA adapter on top of the base model (requires `peft`)
+ model.load_adapter(finetuned_model_id)
+
+ # Load tokenizer
+ tokenizer = AutoTokenizer.from_pretrained(base_model_id)
+
+ def translate_darija(text):
+     messages = [
+         {"role": "system", "content": "You are a professional NLP data parser. Follow the provided task and output scheme for consistency."},
+         {"role": "user", "content": f"## Task:\n{text}\n\n## English Translation:"}
+     ]
+
+     # Build the chat prompt and tokenize it
+     text_input = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+     model_inputs = tokenizer([text_input], return_tensors="pt").to(device)
+
+     # Greedy decoding (do_sample=False), so no temperature is needed
+     generated_ids = model.generate(**model_inputs, max_new_tokens=1024, do_sample=False)
+
+     # Keep only the newly generated tokens (drop the prompt) before decoding
+     generated_ids = [
+         output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+     ]
+     translation = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+
+     return translation
+
+ # Example usage
+ query = "Your Darija text here"
+ response = translate_darija(query)
+ print(response)
+ ```
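+
+ Alternatively, the adapter can be attached with the `peft` library directly. This is a minimal sketch, assuming `ELhadratiOth/darija-english-translater` is a standard PEFT LoRA adapter for the base model:
+
+ ```python
+ from peft import PeftModel
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
+ adapter_id = "ELhadratiOth/darija-english-translater"
+
+ # Load the base model, then wrap it with the LoRA adapter weights
+ base_model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto", torch_dtype="auto")
+ model = PeftModel.from_pretrained(base_model, adapter_id)
+
+ # Optionally fold the adapter into the base weights for faster inference
+ model = model.merge_and_unload()
+
+ tokenizer = AutoTokenizer.from_pretrained(base_model_id)
+ ```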
+
+ ## Training Details
+
+ ### Hyperparameters
+ - **Learning Rate**: 0.0001
+ - **Batch Size**:
+   - Train: 1
+   - Eval: 1
+ - **Seed**: 42
+ - **Distributed Training**: Multi-GPU
+ - **Number of Devices**: 2
+ - **Gradient Accumulation Steps**: 4
+ - **Total Train Batch Size**: 8
+ - **Total Eval Batch Size**: 2
+ - **Optimizer**: AdamW (betas=(0.9, 0.999), epsilon=1e-08)
+ - **LR Scheduler**: Cosine
+ - **Warmup Ratio**: 0.1
+ - **Epochs**: 10
+
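+ These values give an effective train batch size of 1 per device × 2 GPUs × 4 accumulation steps = 8. As a rough sketch only (the run itself was launched through LLaMA-Factory, and the output directory name below is hypothetical), they map onto `transformers.TrainingArguments` like this:
+
+ ```python
+ from transformers import TrainingArguments
+
+ training_args = TrainingArguments(
+     output_dir="models",              # hypothetical output directory
+     learning_rate=1e-4,
+     per_device_train_batch_size=1,
+     per_device_eval_batch_size=1,
+     gradient_accumulation_steps=4,    # 1 per device x 2 GPUs x 4 steps = total train batch size 8
+     seed=42,
+     optim="adamw_torch",
+     adam_beta1=0.9,
+     adam_beta2=0.999,
+     adam_epsilon=1e-8,
+     lr_scheduler_type="cosine",
+     warmup_ratio=0.1,
+     num_train_epochs=10.0,
+ )
+ ```
+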
+ ### Framework Versions
+ - PEFT: 0.12.0
+ - Transformers: 4.49.0
+ - PyTorch: 2.5.1+cu121
+ - Datasets: 3.2.0
+ - Tokenizers: 0.21.0