Update README.md

README.md (CHANGED)
@@ -139,7 +139,7 @@ QLoRA (Quantized LoRA) was employed to optimize the model's computational eff

- **bias:** Set to "none" to exclude bias terms from adaptation, simplifying the model architecture.
- **lora_dropout:** Reduced to 0.025 from the default 0.05, controlling the dropout rate during adaptation.
- **task_type:** Configured as "CAUSAL_LM" to indicate the task type of the language model.

```python
config = LoraConfig(
    r=8,
```
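The snippet above is cut off by the hunk boundary. For reference, a complete configuration consistent with the settings described above might look like the following sketch; only `r`, `bias`, `lora_dropout`, and `task_type` come from the README, while `lora_alpha` is an illustrative assumption:

```python
# Sketch of a full LoRA configuration built from the values documented above.
from peft import LoraConfig

config = LoraConfig(
    r=8,                    # low-rank dimension, as shown in the hunk
    lora_alpha=16,          # assumption: the actual scaling factor is not visible here
    bias="none",            # exclude bias terms from adaptation
    lora_dropout=0.025,     # reduced from the default 0.05
    task_type="CAUSAL_LM",  # causal language modeling
)
```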
@@ -163,20 +163,20 @@ After fine-tuning, the LoRA-adjusted weights were merged back with the base Gemm
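The context line above mentions merging the LoRA weights back into the base Gemma model. A minimal sketch of that step with peft's `merge_and_unload`, assuming a Hugging Face checkpoint layout (the model id and paths are placeholders, not the repository's actual names):

```python
# Sketch: fold trained LoRA adapter weights back into the base model.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("google/gemma-2b")   # placeholder base model id
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")  # placeholder adapter path
merged = model.merge_and_unload()  # applies the low-rank updates to the base weights
merged.save_pretrained("gemma-merged")  # placeholder output directory
```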

During training, Weights & Biases (wandb) was used to log and visualize key metrics. Its tracking enabled real-time monitoring of training progress, evaluation metrics, and model performance, and its interactive dashboards supported optimization and debugging. This integration also improves transparency and reproducibility for other researchers and practitioners. The final values logged for this run were:

- eval/loss: 1.1386919021606443
- eval/runtime: 44.2153
- eval/samples_per_second: 8.707
- eval/steps_per_second: 8.707
- train/epoch: 49.62
- train/global_step: 4,850
- train/grad_norm: 3.5548949241638184
- train/learning_rate: 0
- train/loss: 0.8596
- train/total_flos: 236,149,029,419,876,350
- train/train_loss: 1.105836234535139
- train/train_runtime: 13,237.4947
- train/train_samples_per_second: 5.9
- train/train_steps_per_second: 0.366
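The README does not show how the logging was wired up. One common pattern with the Hugging Face Trainer, sketched here as an assumption rather than the repository's actual setup, is to route metrics to wandb through `report_to` (the project name is a placeholder):

```python
# Sketch: send train/* and eval/* metrics to Weights & Biases during training.
import wandb
from transformers import TrainingArguments

wandb.init(project="gemma-qlora-finetune")  # placeholder project name

args = TrainingArguments(
    output_dir="outputs",
    report_to="wandb",  # Trainer logs metrics like train/loss and eval/loss to wandb
    logging_steps=10,   # assumption: the actual logging interval is not stated
)
```

With this in place, the `train/*` and `eval/*` series listed above appear in the run's wandb dashboard.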
## Environmental impact