---
license: other
base_model: Qwen/Qwen1.5-4B
tags:
- generated_from_trainer
datasets:
- tyzhu/lmind_hotpot_train8000_eval7405_v1_qa
metrics:
- accuracy
model-index:
- name: lmind_hotpot_train8000_eval7405_v1_qa_1e-4_lora2
  results:
  - task:
      name: Causal Language Modeling
      type: text-generation
    dataset:
      name: tyzhu/lmind_hotpot_train8000_eval7405_v1_qa
      type: tyzhu/lmind_hotpot_train8000_eval7405_v1_qa
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.4897142857142857
library_name: peft
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# lmind_hotpot_train8000_eval7405_v1_qa_1e-4_lora2

This model is a PEFT adapter fine-tuned from [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) on the tyzhu/lmind_hotpot_train8000_eval7405_v1_qa dataset.
After the final epoch (epoch 50, step 12500), it achieves the following results on the evaluation set:
- Loss: 4.1528
- Accuracy: 0.4897

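For readers who prefer perplexity to raw loss, the reported evaluation loss can be converted as in the sketch below. It assumes the loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language modeling but is not stated in this card.

```python
import math

# Final evaluation loss reported in this card.
final_eval_loss = 4.1528

# Assumption: the loss is the mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(final_eval_loss)

print(f"eval perplexity ~ {perplexity:.1f}")
```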
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 50.0

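The two derived batch sizes in the list follow from the per-device values. The sketch below reproduces that arithmetic and, under the assumption that the training split holds 8,000 examples (inferred from `train8000` in the dataset name, not confirmed by this card), the number of optimizer steps per epoch.

```python
# Values copied from the hyperparameter list.
train_batch_size = 2             # per device
eval_batch_size = 2              # per device
num_devices = 4
gradient_accumulation_steps = 4
num_epochs = 50

# Assumption: 8,000 training examples, inferred from the dataset name.
num_train_examples = 8000

# Effective batch per optimizer step: per-device batch x devices x accumulation.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
# Evaluation does not accumulate gradients, so only devices multiply.
total_eval_batch_size = eval_batch_size * num_devices

steps_per_epoch = num_train_examples // total_train_batch_size
total_steps = steps_per_epoch * num_epochs

print(total_train_batch_size, total_eval_batch_size, steps_per_epoch, total_steps)
```

Under that assumption the result is 250 steps per epoch and 12,500 steps in total, which agrees with the Step column of the training results.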
### Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 2.2503        | 1.0   | 250   | 2.3237          | 0.5156   |
| 2.0870        | 2.0   | 500   | 2.3309          | 0.5164   |
| 1.8490        | 3.0   | 750   | 2.4019          | 0.5145   |
| 1.6193        | 4.0   | 1000  | 2.5039          | 0.5104   |
| 1.3666        | 5.0   | 1250  | 2.6544          | 0.5050   |
| 1.1435        | 6.0   | 1500  | 2.8436          | 0.5011   |
| 0.9171        | 7.0   | 1750  | 3.0320          | 0.4971   |
| 0.7531        | 8.0   | 2000  | 3.2585          | 0.4930   |
| 0.6101        | 9.0   | 2250  | 3.3418          | 0.4925   |
| 0.5392        | 10.0  | 2500  | 3.5373          | 0.4916   |
| 0.4718        | 11.0  | 2750  | 3.6313          | 0.4893   |
| 0.4446        | 12.0  | 3000  | 3.6736          | 0.4906   |
| 0.4204        | 13.0  | 3250  | 3.7342          | 0.4906   |
| 0.4131        | 14.0  | 3500  | 3.7778          | 0.4897   |
| 0.3924        | 15.0  | 3750  | 3.8210          | 0.4897   |
| 0.3913        | 16.0  | 4000  | 3.8833          | 0.4904   |
| 0.3760        | 17.0  | 4250  | 3.8936          | 0.4898   |
| 0.3785        | 18.0  | 4500  | 3.8824          | 0.4900   |
| 0.3670        | 19.0  | 4750  | 3.9720          | 0.4901   |
| 0.3676        | 20.0  | 5000  | 3.9374          | 0.4909   |
| 0.3602        | 21.0  | 5250  | 3.9380          | 0.4904   |
| 0.3639        | 22.0  | 5500  | 3.9516          | 0.4910   |
| 0.3533        | 23.0  | 5750  | 4.0207          | 0.4916   |
| 0.3587        | 24.0  | 6000  | 3.9905          | 0.4917   |
| 0.3479        | 25.0  | 6250  | 4.0617          | 0.4915   |
| 0.3511        | 26.0  | 6500  | 4.0106          | 0.4903   |
| 0.3442        | 27.0  | 6750  | 4.0401          | 0.4910   |
| 0.3496        | 28.0  | 7000  | 4.0157          | 0.4897   |
| 0.3400        | 29.0  | 7250  | 4.0503          | 0.4902   |
| 0.3448        | 30.0  | 7500  | 4.0786          | 0.4908   |
| 0.3406        | 31.0  | 7750  | 4.1239          | 0.4905   |
| 0.3375        | 32.0  | 8000  | 4.1210          | 0.4915   |
| 0.3390        | 33.0  | 8250  | 4.1039          | 0.4898   |
| 0.3418        | 34.0  | 8500  | 4.0879          | 0.4902   |
| 0.3364        | 35.0  | 8750  | 4.0782          | 0.4907   |
| 0.3421        | 36.0  | 9000  | 4.0512          | 0.4910   |
| 0.3337        | 37.0  | 9250  | 4.1727          | 0.4895   |
| 0.3375        | 38.0  | 9500  | 4.1615          | 0.4889   |
| 0.3304        | 39.0  | 9750  | 4.1755          | 0.4899   |
| 0.3341        | 40.0  | 10000 | 4.1542          | 0.4903   |
| 0.3311        | 41.0  | 10250 | 4.1479          | 0.4889   |
| 0.3337        | 42.0  | 10500 | 4.1005          | 0.4907   |
| 0.3284        | 43.0  | 10750 | 4.1688          | 0.4909   |
| 0.3343        | 44.0  | 11000 | 4.1412          | 0.4904   |
| 0.3301        | 45.0  | 11250 | 4.0906          | 0.4917   |
| 0.3307        | 46.0  | 11500 | 4.1221          | 0.4895   |
| 0.3280        | 47.0  | 11750 | 4.1250          | 0.4892   |
| 0.3293        | 48.0  | 12000 | 4.1082          | 0.4911   |
| 0.3261        | 49.0  | 12250 | 4.1219          | 0.4903   |
| 0.3279        | 50.0  | 12500 | 4.1528          | 0.4897   |

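In the table, training loss falls from 2.2503 to 0.3279 while validation loss rises from 2.3237 to 4.1528, a pattern usually associated with overfitting. The sketch below shows one way to pick a checkpoint from such a log. It uses a handful of rows copied from the table and is illustrative only; it is not part of the original training pipeline.

```python
# (epoch, validation_loss, accuracy) for a few rows copied from the table.
rows = [
    (1, 2.3237, 0.5156),
    (2, 2.3309, 0.5164),
    (3, 2.4019, 0.5145),
    (10, 3.5373, 0.4916),
    (25, 4.0617, 0.4915),
    (50, 4.1528, 0.4897),
]

# Lowest validation loss and highest accuracy among the sampled rows.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_acc = max(rows, key=lambda r: r[2])
final = rows[-1]

print("best validation loss at epoch", best_by_loss[0])
print("best accuracy at epoch", best_by_acc[0])
print("accuracy drop from best to final:", round(best_by_acc[2] - final[2], 4))
```

On these rows the lowest validation loss is at epoch 1 and the highest accuracy at epoch 2, so an early checkpoint may be preferable to the final one if both are available.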

### Framework versions

- PEFT 0.5.0
- Transformers 4.41.1
- PyTorch 2.1.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1