---
base_model: Qwen/Qwen-14B
tags:
- generated_from_trainer
model-index:
- name: OpenAssistant_oasst_top1_2023-08-25
results: []
---
# OpenAssistant_oasst_top1_2023-08-25
This model is a fine-tuned version of [Qwen/Qwen-14B](https://huggingface.co/Qwen/Qwen-14B). The auto-generated card did not record the dataset, but the model name points to the OpenAssistant oasst_top1_2023-08-25 conversations.
It achieves the following results on the evaluation set:
- Loss: 1.6501
## Model description
A 14B-parameter causal language model produced by fine-tuning Qwen/Qwen-14B for one epoch, presumably on top-1-ranked OpenAssistant conversation threads (inferred from the model name). Details beyond the training log, such as the prompt template, context length, and whether the fine-tuning was full or parameter-efficient, are not documented.
## Intended uses & limitations
Intended for assistant-style conversational generation, in line with the OpenAssistant data it appears to be tuned on. It inherits the limitations of the Qwen-14B base model and of crowd-sourced dialogue data (possible factual errors and biases), and no safety or alignment evaluation is documented; a usage sketch follows.
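The sketch below loads the checkpoint for generation. The repo id is a placeholder (substitute the actual Hub id of this checkpoint), and `trust_remote_code=True` is needed because Qwen models ship custom modeling code.

```python
# Minimal inference sketch; "your-org/OpenAssistant_oasst_top1_2023-08-25"
# is a hypothetical repo id, not a confirmed one.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/OpenAssistant_oasst_top1_2023-08-25"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",        # shard the 14B weights across available devices
    trust_remote_code=True,   # Qwen ships custom modeling/tokenizer code
)

prompt = "Explain gradient accumulation in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```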
## Training and evaluation data
Not recorded by the trainer. Going by the model name, the data is presumably the OpenAssistant oasst_top1_2023-08-25 release (top-1-ranked conversation threads from the OASST corpus), with a held-out split producing the validation losses reported below; a hedged loading sketch follows.
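If that guess is right, the data can be inspected like this (the Hub id and split layout are assumptions):

```python
# Hedged sketch: load the dataset the model name suggests was used.
from datasets import load_dataset

ds = load_dataset("OpenAssistant/oasst_top1_2023-08-25")  # assumed Hub id
print(ds)              # splits and example counts
print(ds["train"][0])  # one formatted conversation record
```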
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 0.01 (fractional, so presumably a warmup *ratio* of 1% of total steps)
- num_epochs: 1
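As a reading aid, here is how these values map onto `transformers.TrainingArguments`. This is a sketch, not the author's actual launch script: the `output_dir` is a placeholder, the optimizer is assumed to be the Trainer's default AdamW (whose betas and epsilon match the logged values), and the fractional warmup value is interpreted as `warmup_ratio`.

```python
# Sketch of the logged hyperparameters as TrainingArguments (assumptions noted).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="OpenAssistant_oasst_top1_2023-08-25",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=16,  # effective batch size: 1 * 16 = 16
    adam_beta1=0.9,                  # Trainer AdamW defaults, matching the log
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,               # 0.01 read as a ratio, not a step count
    num_train_epochs=1,
)
```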
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.163 | 0.02 | 16 | 1.9459 |
| 1.9498 | 0.04 | 32 | 1.8467 |
| 1.9578 | 0.06 | 48 | 1.7864 |
| 1.8398 | 0.08 | 64 | 1.7530 |
| 1.7696 | 0.1 | 80 | 1.7076 |
| 1.7744 | 0.12 | 96 | 1.7275 |
| 1.8108 | 0.14 | 112 | 1.6887 |
| 1.7707 | 0.17 | 128 | 1.6942 |
| 1.787 | 0.19 | 144 | 1.6894 |
| 1.7029 | 0.21 | 160 | 1.6760 |
| 1.6732 | 0.23 | 176 | 1.6838 |
| 1.6313 | 0.25 | 192 | 1.6754 |
| 1.7071 | 0.27 | 208 | 1.6752 |
| 1.6781 | 0.29 | 224 | 1.6741 |
| 1.7782 | 0.31 | 240 | 1.6698 |
| 1.6836 | 0.33 | 256 | 1.6592 |
| 1.7229 | 0.35 | 272 | 1.6633 |
| 1.7196 | 0.37 | 288 | 1.6638 |
| 1.6892 | 0.39 | 304 | 1.6627 |
| 1.6844 | 0.41 | 320 | 1.6557 |
| 1.8027 | 0.43 | 336 | 1.6540 |
| 1.692 | 0.45 | 352 | 1.6577 |
| 1.7088 | 0.47 | 368 | 1.6611 |
| 1.7987 | 0.5 | 384 | 1.6557 |
| 1.709 | 0.52 | 400 | 1.6600 |
| 1.701 | 0.54 | 416 | 1.6588 |
| 1.6784 | 0.56 | 432 | 1.6594 |
| 1.6997 | 0.58 | 448 | 1.6484 |
| 1.7811 | 0.6 | 464 | 1.6583 |
| 1.7628 | 0.62 | 480 | 1.6461 |
| 1.6254 | 0.64 | 496 | 1.6527 |
| 1.6684 | 0.66 | 512 | 1.6520 |
| 1.6837 | 0.68 | 528 | 1.6570 |
| 1.7209 | 0.7 | 544 | 1.6543 |
| 1.677 | 0.72 | 560 | 1.6562 |
| 1.6819 | 0.74 | 576 | 1.6517 |
| 1.7072 | 0.76 | 592 | 1.6551 |
| 1.6446 | 0.78 | 608 | 1.6562 |
| 1.6908 | 0.8 | 624 | 1.6528 |
| 1.7209 | 0.83 | 640 | 1.6518 |
| 1.6818 | 0.85 | 656 | 1.6517 |
| 1.7007 | 0.87 | 672 | 1.6525 |
| 1.8077 | 0.89 | 688 | 1.6522 |
| 1.6856 | 0.91 | 704 | 1.6516 |
| 1.7247 | 0.93 | 720 | 1.6509 |
| 1.6645 | 0.95 | 736 | 1.6500 |
| 1.6841 | 0.97 | 752 | 1.6499 |
| 1.7244 | 0.99 | 768 | 1.6501 |
### Framework versions
- Transformers 4.32.0
- Pytorch 2.1.0
- Datasets 2.14.7
- Tokenizers 0.13.3
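Reproduction is most reliable with matching versions; a quick sanity check:

```python
# Print installed versions next to those listed above.
import datasets, tokenizers, torch, transformers

expected = {
    "transformers": (transformers.__version__, "4.32.0"),
    "torch": (torch.__version__, "2.1.0"),
    "datasets": (datasets.__version__, "2.14.7"),
    "tokenizers": (tokenizers.__version__, "0.13.3"),
}
for name, (got, want) in expected.items():
    status = "OK" if got.startswith(want) else f"expected {want}"
    print(f"{name} {got} ({status})")
```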