Tags: Text Generation · Transformers · Safetensors · Spanish · gemma · Legal · Law · Peru · Leyes Juridicas · conversational · text-generation-inference · Inference Endpoints
daqc committed · commit 13b4e5e · verified · 1 parent: 6eaf1af

Update README.md

Files changed (1): README.md (+15 −15)
README.md CHANGED
@@ -139,7 +139,7 @@ QLoRA (Quantization LoRA) was employed to optimize the model's computational eff
 - **bias:** Set to "none" to exclude bias terms from adaptation, simplifying the model architecture.
 - **lora_dropout:** Reduced to 0.025 from the default 0.05, controlling the dropout rate during adaptation.
 - **task_type:** Configured as "CAUSAL_LM" to indicate the task type of the language model.
--
+
 ```python
 config = LoraConfig(
     r=8,
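
The diff context above cuts the `LoraConfig` block off after `r=8`. For orientation only, here is a minimal sketch of what a complete QLoRA setup with the parameters this card describes might look like; the base checkpoint, `target_modules`, `lora_alpha`, and the 4-bit quantization settings are illustrative assumptions, not values taken from this commit.

```python
# Hypothetical sketch of a full QLoRA setup; only r=8, bias="none",
# lora_dropout=0.025, and task_type="CAUSAL_LM" come from the card itself.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# QLoRA: load the frozen base model with 4-bit NF4 quantization (assumed settings).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b-it",  # assumed checkpoint; the card only names "Gemma"
    quantization_config=bnb_config,
    device_map="auto",
)

config = LoraConfig(
    r=8,                    # low-rank dimension, as documented
    lora_alpha=16,          # assumed; not visible in the diff context
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    bias="none",            # exclude bias terms from adaptation
    lora_dropout=0.025,     # halved from the 0.05 default
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
model.print_trainable_parameters()  # sanity check: only adapter weights train
```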
@@ -163,20 +163,20 @@ After fine-tuning, the LoRA-adjusted weights were merged back with the base Gemm
 
 During the training process, Wandb (Weights & Biases) was used for comprehensive logging and visualization of key metrics. Wandb's powerful tracking capabilities enabled real-time monitoring of training progress, evaluation metrics, and model performance. Through interactive dashboards and visualizations, Wandb facilitated deep insights into the training dynamics, allowing for efficient model optimization and debugging. This logging integration with Wandb enhances transparency, reproducibility, and collaboration among researchers and practitioners.
 
-eval/loss:1.1386919021606443
-eval/runtime:44.2153
-eval/samples_per_second:8.707
-eval/steps_per_second:8.707
-train/epoch:49.62
-train/global_step:4,850
-train/grad_norm:3.5548949241638184
-train/learning_rate:0
-train/loss:0.8596
-train/total_flos:236,149,029,419,876,350
-train/train_loss:1.105836234535139
-train/train_runtime:13,237.4947
-train/train_samples_per_second:5.9
-train/train_steps_per_second:0.366
+- eval/loss:1.1386919021606443
+- eval/runtime:44.2153
+- eval/samples_per_second:8.707
+- eval/steps_per_second:8.707
+- train/epoch:49.62
+- train/global_step:4,850
+- train/grad_norm:3.5548949241638184
+- train/learning_rate:0
+- train/loss:0.8596
+- train/total_flos:236,149,029,419,876,350
+- train/train_loss:1.105836234535139
+- train/train_runtime:13,237.4947
+- train/train_samples_per_second:5.9
+- train/train_steps_per_second:0.366
 
 
 ## Environmental impact
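
As a companion to the metric list this hunk reformats: the `eval/*` and `train/*` names match what the transformers `Trainer` reports to W&B automatically. The sketch below shows one common way that wiring is done; the project name and argument values are illustrative assumptions, not taken from this commit. For reference, the final eval/loss of ≈1.139 corresponds to a perplexity of e^1.139 ≈ 3.12, and train/learning_rate reaching 0 is consistent with a schedule that decays to zero by the final step.

```python
# Illustrative sketch (not from the commit) of W&B integration via the
# transformers Trainer, which emits train/* and eval/* metrics like those above.
import wandb
from transformers import TrainingArguments

wandb.init(
    project="gemma-legal-peru",  # hypothetical project name
    mode="offline",              # runs without a W&B login; sync later if desired
)

training_args = TrainingArguments(
    output_dir="./out",
    report_to="wandb",            # stream train/* and eval/* metrics to W&B
    evaluation_strategy="steps",  # log eval/loss periodically during training
    eval_steps=100,
    logging_steps=10,
)

# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=train_ds, eval_dataset=eval_ds)
# trainer.train()
# wandb.finish()
```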
 