---
license: apache-2.0
base_model: TheBloke/OpenHermes-2-Mistral-7B-GPTQ
tags:
- generated_from_trainer
model-index:
- name: openhermes-mistral-dpo-gptq
  results: []
---


# openhermes-mistral-dpo-gptq

This model is a fine-tuned version of [TheBloke/OpenHermes-2-Mistral-7B-GPTQ](https://huggingface.co/TheBloke/OpenHermes-2-Mistral-7B-GPTQ). The training dataset was not recorded by the Trainer.
It achieves the following results on the evaluation set at the final training step (step 50):
- Loss: 0.4545
- Rewards/chosen: -0.0587
- Rewards/rejected: -1.0907
- Rewards/accuracies: 0.875
- Rewards/margins: 1.0320
- Logps/rejected: -312.2487
- Logps/chosen: -273.6681
- Logits/rejected: -1.8614
- Logits/chosen: -1.7936
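
The reward metrics above are related: the margin is the gap between the reward for chosen responses and the reward for rejected responses. The sketch below checks that relation against the reported numbers. The helper name `reward_margin` is hypothetical and used only for illustration; the values are copied from the list above.

```python
# Evaluation metrics copied from the list above.
eval_metrics = {
    "rewards/chosen": -0.0587,
    "rewards/rejected": -1.0907,
    "rewards/margins": 1.0320,
}


def reward_margin(chosen: float, rejected: float) -> float:
    """Gap between the mean reward of chosen and rejected responses."""
    return chosen - rejected


margin = reward_margin(
    eval_metrics["rewards/chosen"], eval_metrics["rewards/rejected"]
)

# Compare the recomputed margin with the reported one, allowing for rounding.
matches_reported = abs(margin - eval_metrics["rewards/margins"]) < 1e-4
print(f"margin={margin:.4f} matches_reported={matches_reported}")
```

A positive margin means the model assigns a higher reward to chosen responses than to rejected ones on average.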

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- training_steps: 50
- mixed_precision_training: Native AMP
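
To illustrate what these settings imply for the learning rate, here is a minimal pure-Python sketch of a linear schedule with warmup, assuming the standard behavior for `lr_scheduler_type: linear`: the rate ramps up linearly over the warmup steps, then decays linearly to zero at the last training step. The function name `linear_lr` is hypothetical; the default values come from the list above.

```python
def linear_lr(
    step: int,
    base_lr: float = 2e-4,   # learning_rate
    warmup_steps: int = 2,   # lr_scheduler_warmup_steps
    total_steps: int = 50,   # training_steps
) -> float:
    """Learning rate at a given optimizer step for linear warmup + linear decay."""
    if step < warmup_steps:
        # Warmup: scale up from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: scale down from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the rate at the evaluation steps listed under "Training results".
for s in (0, 1, 2, 10, 20, 30, 40, 50):
    print(s, f"{linear_lr(s):.2e}")
```

With only 2 warmup steps out of 50, nearly the whole run is spent in the decay phase.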

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.6989        | 0.01  | 10   | 0.6566          | -0.0830        | -0.1482          | 0.75               | 0.0652          | -302.8232      | -273.9107    | -1.8738         | -1.7954       |
| 0.6578        | 0.01  | 20   | 0.5787          | 0.0468         | -0.2201          | 0.8125             | 0.2669          | -303.5421      | -272.6130    | -1.8707         | -1.7965       |
| 0.715         | 0.01  | 30   | 0.5021          | 0.2256         | -0.3134          | 0.8125             | 0.5391          | -304.4756      | -270.8246    | -1.8729         | -1.8014       |
| 0.6847        | 0.02  | 40   | 0.4673          | 0.2097         | -0.6320          | 0.875              | 0.8417          | -307.6610      | -270.9843    | -1.8682         | -1.7996       |
| 0.7869        | 0.03  | 50   | 0.4545          | -0.0587        | -1.0907          | 0.875              | 1.0320          | -312.2487      | -273.6681    | -1.8614         | -1.7936       |


### Framework versions

- Transformers 4.35.2
- PyTorch 2.0.1+cu117
- Datasets 2.15.0
- Tokenizers 0.15.0