PEFT · Safetensors · English
manupinasco committed on
Commit fc47cc4 · verified · 1 Parent(s): f0e483d

Update README.md

Files changed (1)
  1. README.md +20 -6
README.md CHANGED
@@ -33,26 +33,40 @@ The model is not designed for complex sentence structures, idiomatic expressions
33
34  Users (both direct and downstream) should be aware that the model's accuracy may decline with more complex or less conventional sentence structures. It's recommended to use this model in conjunction with other tools for more comprehensive linguistic analysis.
35
36  ## Training Details
37
38  ### Training Data
39
40  The model was trained on a curated dataset of simple English sentences annotated with Universal Dependency Parsing tags. The training data focused on ensuring high accuracy in syntactic role assignment.
41
42  ### Training Procedure
43
44
45
46  - #### Training Hyperparameters
47
48  - - **Training regime:** Mixed precision (fp16)
49
50  ## Evaluation
51
52  -
53  -
54  - ### Testing Data, Factors & Metrics
55  -
56
57  #### Testing Data
58
33
34  Users (both direct and downstream) should be aware that the model's accuracy may decline with more complex or less conventional sentence structures. It's recommended to use this model in conjunction with other tools for more comprehensive linguistic analysis.
35
36  +
37  ## Training Details
38
39  + The model was trained on a curated dataset of simple English sentences annotated with Universal Dependency Parsing tags. The dataset was sourced from the "manupinasco/syntax_analysis" dataset available on Hugging Face's Datasets Hub. The training data focused on ensuring high accuracy in syntactic role assignment, aiming to improve the model's ability to understand and generate syntactically correct responses.
40  +
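As an illustration, the dataset named above can be loaded directly from the Hugging Face Hub; the `train` split name is an assumption, since the card only names the dataset:

```python
from datasets import load_dataset

# Dataset named in the card; the "train" split name is an assumption.
train_data = load_dataset("manupinasco/syntax_analysis", split="train")
print(train_data[0])  # inspect one annotated sentence
```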
41  ### Training Data
42
43  The model was trained on a curated dataset of simple English sentences annotated with Universal Dependency Parsing tags. The training data focused on ensuring high accuracy in syntactic role assignment.
44
45  ### Training Procedure
46
47  + The training procedure involved fine-tuning the unsloth/Meta-Llama-3.1-8B-Instruct model using a custom prompt format inspired by the Alpaca prompt template. The procedure included quantization to 4-bit to reduce memory usage, and mixed precision training to leverage GPU capabilities effectively.
48
49  + Key components of the training process:
50
51  + Model Quantization: 4-bit quantization was applied to the model to reduce VRAM usage while maintaining performance.
52  + Gradient Checkpointing: Enabled using "unsloth" mode to save memory during training, which allowed handling longer sequences effectively.
53  + Prompt Template: The model was trained using a structured prompt that provided instructions and expected responses, ensuring consistency and clarity in the tasks presented to the model.
54
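A minimal sketch of how the 4-bit loading, "unsloth" gradient checkpointing, and Alpaca-inspired prompt described above are typically wired together with Unsloth. The sequence length, LoRA rank and alpha, target modules, and the exact prompt wording are assumptions rather than values stated in the card:

```python
from unsloth import FastLanguageModel

# Load the base model in 4-bit to cut VRAM usage.
# max_seq_length and dtype=None (auto-detect) are assumptions.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-Instruct",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

# Attach LoRA adapters and enable Unsloth's gradient checkpointing mode.
# r, lora_alpha, and target_modules are illustrative defaults, not card values.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",
)

# An Alpaca-inspired instruction/response template; the exact wording used in
# training is not reproduced in the card, so this format is an assumption.
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
{output}"""
```

The "unsloth" checkpointing mode trades recomputation for activation memory, which is how the longer sequences mentioned above become feasible on limited VRAM.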
 
55  + #### Training Hyperparameters
56
57  + - **Batch Size**: 2 per device
58  + - **Gradient Accumulation**: 4 steps
59  + - **Warmup Steps**: 5
60  + - **Max Training Steps**: 60
61  + - **Learning Rate**: 2e-4
62  + - **Optimizer**: AdamW with 8-bit quantization
63  + - **Weight Decay**: 0.01
64  + - **LR Scheduler**: Linear
65  + - **Mixed Precision**: fp16 (or bf16 if supported)
66
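The hyperparameters listed above map directly onto a `TrainingArguments`/`SFTTrainer` setup. The sketch below follows the older `trl` calling convention used in Unsloth notebooks; `output_dir`, `dataset_text_field`, `max_seq_length`, and `logging_steps` are assumptions:

```python
import torch
from transformers import TrainingArguments
from trl import SFTTrainer

training_args = TrainingArguments(
    per_device_train_batch_size=2,            # Batch Size: 2 per device
    gradient_accumulation_steps=4,            # Gradient Accumulation: 4 steps
    warmup_steps=5,                           # Warmup Steps: 5
    max_steps=60,                             # Max Training Steps: 60
    learning_rate=2e-4,                       # Learning Rate: 2e-4
    optim="adamw_8bit",                       # AdamW with 8-bit quantization
    weight_decay=0.01,                        # Weight Decay: 0.01
    lr_scheduler_type="linear",               # LR Scheduler: Linear
    fp16=not torch.cuda.is_bf16_supported(),  # Mixed Precision: fp16 ...
    bf16=torch.cuda.is_bf16_supported(),      # ... or bf16 if supported
    logging_steps=1,
    output_dir="outputs",
)

trainer = SFTTrainer(
    model=model,               # PEFT model from the loading sketch above
    tokenizer=tokenizer,
    train_dataset=train_data,  # assumed to hold a "text" column of formatted prompts
    dataset_text_field="text",
    max_seq_length=2048,
    args=training_args,
)
trainer.train()
```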
67  ## Evaluation
68
69  + The model's performance was evaluated using the test split from the same dataset. The evaluation focused on syntactic role assignment accuracy.
70
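One plausible way to reproduce the evaluation described above is to score exact matches between generated and gold syntactic role labels on the test split. The split and column names (`test`, `text`, `label`) and the instruction wording are assumptions, since the card does not specify them:

```python
from datasets import load_dataset
from unsloth import FastLanguageModel

FastLanguageModel.for_inference(model)  # switch the fine-tuned model to inference mode

# Split and column names are assumptions; the card only mentions "the test split".
test_data = load_dataset("manupinasco/syntax_analysis", split="test")

correct = 0
for example in test_data:
    prompt = alpaca_prompt.format(
        instruction="Assign Universal Dependency syntactic roles to the sentence.",
        input=example["text"],
        output="",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    generated = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    answer = tokenizer.decode(
        generated[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    correct += int(answer.strip() == example["label"].strip())

print(f"Syntactic role assignment accuracy: {correct / len(test_data):.2%}")
```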
 
71  #### Testing Data
72