macadeliccc committed · Commit 9f2ff65 · verified · 1 parent: 1da0eed

Update README.md

Files changed (1): README.md (+4 −2)
README.md CHANGED

@@ -189,7 +189,7 @@ special_tokens:
 
 </details><br>
 
-# outputs/lora-out
+# magistrate-3.2-3b-base
 
 This model is a fine-tuned version of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) on the None dataset.
 It achieves the following results on the evaluation set:
@@ -197,7 +197,7 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-More information needed
+This is a base model trained on US Supreme Court proceedings, US federal code, and regulations. It is a proof of concept for a larger model, since fine-tuning something like a 70B can be very expensive.
 
 ## Intended uses & limitations
 
@@ -209,6 +209,8 @@ More information needed
 
 ## Training procedure
 
+Spectrum top-35% fine-tune. Methodology based on Cohere's paper "To Code, or Not To Code? Exploring Impact of Code in Pre-training".
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
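For context on the "Spectrum top 35%" approach named in the diff: Spectrum ranks a model's layers by a per-layer signal-to-noise ratio and trains only the top fraction, freezing the rest. A minimal sketch of the layer-selection step (the layer names and scores below are hypothetical placeholders, not values from this model):

```python
def select_top_layers(layer_scores, fraction=0.35):
    """Return the names of the top `fraction` of layers, ranked by score.

    In Spectrum the score would be a per-layer signal-to-noise ratio;
    here the scores are placeholders supplied by the caller.
    """
    ranked = sorted(layer_scores, key=layer_scores.get, reverse=True)
    k = max(1, int(len(ranked) * fraction))  # at least one layer stays trainable
    return set(ranked[:k])


# Example with hypothetical per-layer scores:
scores = {
    "layers.0.self_attn.q_proj": 0.12,
    "layers.1.self_attn.q_proj": 0.80,
    "layers.2.self_attn.q_proj": 0.55,
    "layers.3.self_attn.q_proj": 0.30,
}
trainable = select_top_layers(scores, fraction=0.35)
# With 4 layers and fraction 0.35, k = max(1, int(1.4)) = 1,
# so only the highest-scoring layer is selected for training.
```

In an actual fine-tune, parameters outside the selected set would then be frozen (e.g. by setting `requires_grad = False` in PyTorch) before training begins.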