shreyasmeher committed · Commit 3db687f · verified · 1 Parent(s): f0bd365

Update README.md

Files changed (1)
  1. README.md +19 -0
README.md CHANGED
@@ -135,6 +135,25 @@ This model is designed for:
  3. Not intended for operational security decisions
  4. Results should be interpreted with appropriate context
 
+
+ ## Training Logs
+ <p align="center">
+ <img src="images/training-logs.png" alt="Training Logs" width="800"/>
+ </p>
+
+ The training logs show a successful training run with healthy convergence patterns:
+
+ **Loss & Learning Rate:**
+ - Loss decreases from 1.95 to ~0.90, with rapid initial improvement
+ - Learning rate uses warmup/decay schedule, peaking at ~1.5x10^-4
+
+ **Training Stability:**
+ - Stable gradient norms (0.4-0.6 range)
+ - Consistent GPU memory usage (~5800MB allocated, 7080MB reserved)
+ - Steady training speed (~3.5s/step) with brief interruption at step 800
+
+ The graphs indicate effective model training with good optimization dynamics and resource utilization. The loss vs. learning rate plot suggests optimal learning around 10^-4.
+
  ## Citation
  ```bibtex
  @misc{conflllama,