Model Card for Prometheus2-8x7b fine-tuned with QLoRA
This model is a fine-tuned version of Prometheus2-8x7b-hf, trained on the Feedback Collection dataset of questions, answers, and evaluations covering general subjects.
Model Details
Fine-tuning was performed with QLoRA (quantized low-rank adaptation).
Model Description
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.
- Developed by: Guilherme Tavares dos Santos
- Model type: Transformer
- Language(s) (NLP): English
- Library: PyTorch (🤗 Transformers)
- License: Apache License 2.0, open-sourced for research and commercial use (https://github.com/prometheus/prometheus/blob/main/LICENSE)
- Finetuned from model: Prometheus2-8x7b-hf (via QLoRA adapters)
Uses
Intended to improve automatic healthcare evaluations, following the LLM-as-a-Judge paradigm.
Direct Use
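A minimal inference sketch of the LLM-as-a-Judge use described above. The adapter repository name is a placeholder, the base model identifier is taken from this card, and the prompt is an illustrative Prometheus-style rubric prompt rather than the exact training template.

```python
# Minimal inference sketch. BASE_MODEL/ADAPTER are placeholders for the actual
# Hub repositories; the prompt below is illustrative, not the training template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

BASE_MODEL = "Prometheus2-8x7b-hf"                  # base judge model named in this card
ADAPTER = "your-username/prometheus2-8x7b-qlora"    # hypothetical adapter repo for this fine-tune

bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER)   # attach the fine-tuned LoRA weights

prompt = (
    "###Task Description: Score the response from 1 to 5 against the rubric and write feedback.\n"
    "###Instruction: A patient asks how to manage mild seasonal allergies at home.\n"
    "###Response: Try over-the-counter antihistamines and avoid known allergens.\n"
    "###Score Rubric: Is the answer safe, relevant, and sufficiently complete?\n"
    "###Feedback:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```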
Bias, Risks, and Limitations
The training dataset is not specialized in healthcare questions, answers, and rubric-scored evaluations. The main risk is obtaining evaluations that are not well tailored to this healthcare use case.
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More healthcare-specific data is needed before further recommendations can be made.
Training Data
The prometheus-eval/Feedback-Collection dataset contains 9,996 rows; 90% of those rows were randomly shuffled and used for training.
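A short sketch of this data preparation, assuming the dataset's default train split and an arbitrary shuffle seed (not necessarily the one used in the original run):

```python
# Load the Feedback-Collection dataset and hold out 10% of the rows;
# the seed and the held-out eval split are assumptions.
from datasets import load_dataset

dataset = load_dataset("prometheus-eval/Feedback-Collection", split="train")
splits = dataset.shuffle(seed=42).train_test_split(test_size=0.10)
train_data, eval_data = splits["train"], splits["test"]
print(len(train_data), len(eval_data))  # roughly 90% / 10% of the rows
```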
Training Procedure
This was not a full fine-tuning: adapters were trained only on the following linear layers of the transformer: ['w3', 'o_proj', 'q_proj', 'gate', 'v_proj', 'w1', 'w2', 'k_proj'].
Trainable parameters: 121,112,576 of 46,823,905,280 total (0.2587%).
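The module names and parameter counts above can be reproduced with helpers along these lines; this is a sketch that assumes the base model was loaded in 4-bit via bitsandbytes, not the author's exact code.

```python
# Helpers to list 4-bit linear layer names and report trainable-parameter counts.
import bitsandbytes as bnb

def find_linear_module_names(model):
    # Leaf names of all 4-bit linear layers, e.g. 'q_proj', 'w1', 'gate', ...
    names = set()
    for name, module in model.named_modules():
        if isinstance(module, bnb.nn.Linear4bit):
            names.add(name.split(".")[-1])
    return sorted(names)

def report_trainable_parameters(model):
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"Trainable: {trainable} Total: {total} Percentage: {100 * trainable / total:.4f}%")
```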
Training Hyperparameters
- Training regime: [More Information Needed]
- For the LoRA step, the following hyperparameters were used: r=8, lora_alpha=32, target_modules=modules (the linear layers listed above), lora_dropout=0.05
The hyperparameters for the training process (put together in the sketch after this list):
per_device_train_batch_size=1,
gradient_accumulation_steps=4,
warmup_steps=0,
max_steps=100,
learning_rate=2e-4,
logging_steps=20,
output_dir="outputs",
optim="paged_adamw_8bit",
save_strategy="epoch"
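A sketch that assembles the listed LoRA and Trainer hyperparameters. The 4-bit base model, tokenizer, and the tokenized train_data split are assumed to come from the earlier snippets, and the data collator and the remaining LoRA fields are assumptions about the original setup.

```python
# Assemble the LoRA configuration and Trainer with the hyperparameters above.
import transformers
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

modules = ['w3', 'o_proj', 'q_proj', 'gate', 'v_proj', 'w1', 'w2', 'k_proj']

model = prepare_model_for_kbit_training(model)      # base model loaded in 4-bit earlier
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=modules,
    lora_dropout=0.05,
    bias="none",                                    # assumption: PEFT defaults elsewhere
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

trainer = transformers.Trainer(
    model=model,
    train_dataset=train_data,                       # tokenized 90% split
    args=transformers.TrainingArguments(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        warmup_steps=0,
        max_steps=100,
        learning_rate=2e-4,
        logging_steps=20,
        output_dir="outputs",
        optim="paged_adamw_8bit",
        save_strategy="epoch",
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```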
Metrics
The training process was assessed with the loss function (the built-in cross-entropy loss for Transformers). This is the cross-entropy between the predicted probability matrix of shape ∥sentence length∥ × ∥vocab∥ (right before taking the argmax to select the output token) and the ∥sentence length∥-length vector of true token IDs. It measures how well the model fits the dataset; in this case, how well the transformer predicts the evaluation for each question and generated answer pair. The loss decreases during training, indicating that learning is taking place. Ideally, adding more data and adjusting the hyperparameters would drive this loss closer to zero.
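A toy illustration of this loss, with random logits standing in for the model's ∥sentence length∥ × ∥vocab∥ output matrix:

```python
# Cross-entropy between a [seq_len, vocab_size] logit matrix and true token IDs.
import torch
import torch.nn.functional as F

vocab_size, seq_len = 32000, 6
logits = torch.randn(seq_len, vocab_size)               # model outputs before argmax
target_ids = torch.randint(0, vocab_size, (seq_len,))   # true token IDs
loss = F.cross_entropy(logits, target_ids)              # averaged over the sequence
print(loss.item())
```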
Results
Metrics from the fine-tuning run:
- global_step: 100
- train_runtime: 1584 s
- train/loss: 0.4637
- Weights & Biases run: https://wandb.ai/eeg-neuko/huggingface/runs/bfnozz0c?nw=nwuserguicondor1512
Hardware
Training ran on a single NVIDIA A100 GPU with 40 GB of VRAM from Google Colab Pro+.