---
license: other
tags:
- generated_from_trainer
datasets:
- AlekseyKorshuk/dalio-all-io
metrics:
- accuracy
model-index:
- name: dalio-all-io-125m-3-epoch
  results:
  - task:
      name: Causal Language Modeling
      type: text-generation
    dataset:
      name: AlekseyKorshuk/dalio-all-io
      type: AlekseyKorshuk/dalio-all-io
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.049654305468258955
---


# dalio-all-io-125m-3-epoch

This model is a fine-tuned version of [facebook/opt-125m](https://huggingface.co/facebook/opt-125m) on the AlekseyKorshuk/dalio-all-io dataset.
It achieves the following results on the evaluation set:
- Loss: 2.7656
- Accuracy: 0.0497
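
The evaluation loss can be read as a perplexity, which is often easier to compare across language models. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, as is usual for causal language modeling; the card itself does not state this.

```python
import math

# Evaluation loss reported in this card.
eval_loss = 2.7656

# Perplexity is exp(loss) when the loss is the mean per-token
# cross-entropy in nats (assumed here, not stated in the card).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```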

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 16
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 3.0
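
As a rough illustration of how these values fit together, the sketch below derives the total batch size from the per-device batch size and device count, and evaluates a half-cosine learning-rate decay over the 87 optimizer steps logged in the training results. The `cosine_lr` helper is hypothetical and written for this example; zero warmup steps and decay to exactly zero are assumptions, since the card lists neither.

```python
import math

# Values copied from the hyperparameter list.
learning_rate = 3e-05
train_batch_size = 2   # per device
num_devices = 8
total_steps = 87       # last step logged in the training results table

# Effective batch size across devices (no gradient accumulation is listed).
total_train_batch_size = train_batch_size * num_devices


def cosine_lr(step, base_lr=learning_rate, num_steps=total_steps, warmup_steps=0):
    """Hypothetical helper: half-cosine decay from base_lr to 0.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, num_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print("total_train_batch_size:", total_train_batch_size)
for step in (0, 29, 58, 87):  # start of training and the end of each epoch
    print(step, f"{cosine_lr(step):.2e}")
```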

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 3.1406        | 0.03  | 1    | 3.0762          | 0.0451   |
| 3.074         | 0.07  | 2    | 3.0762          | 0.0451   |
| 3.0557        | 0.1   | 3    | 3.0762          | 0.0451   |
| 3.2166        | 0.14  | 4    | 3.0176          | 0.0457   |
| 3.0989        | 0.17  | 5    | 2.9922          | 0.0460   |
| 3.0732        | 0.21  | 6    | 2.9746          | 0.0464   |
| 3.0867        | 0.24  | 7    | 2.9629          | 0.0463   |
| 2.979         | 0.28  | 8    | 2.9512          | 0.0467   |
| 3.1838        | 0.31  | 9    | 2.9414          | 0.0467   |
| 2.9399        | 0.34  | 10   | 2.9336          | 0.0467   |
| 2.926         | 0.38  | 11   | 2.9258          | 0.0471   |
| 3.2144        | 0.41  | 12   | 2.9199          | 0.0473   |
| 2.978         | 0.45  | 13   | 2.9141          | 0.0474   |
| 3.0076        | 0.48  | 14   | 2.9082          | 0.0476   |
| 2.9897        | 0.52  | 15   | 2.9023          | 0.0477   |
| 2.8831        | 0.55  | 16   | 2.8945          | 0.0479   |
| 2.9749        | 0.59  | 17   | 2.8867          | 0.0479   |
| 2.9431        | 0.62  | 18   | 2.8828          | 0.0478   |
| 3.0498        | 0.66  | 19   | 2.8770          | 0.0479   |
| 2.9409        | 0.69  | 20   | 2.8711          | 0.0479   |
| 2.96          | 0.72  | 21   | 2.8672          | 0.0480   |
| 3.0767        | 0.76  | 22   | 2.8633          | 0.0478   |
| 2.772         | 0.79  | 23   | 2.8594          | 0.0479   |
| 3.0574        | 0.83  | 24   | 2.8535          | 0.0480   |
| 2.8137        | 0.86  | 25   | 2.8496          | 0.0480   |
| 2.8872        | 0.9   | 26   | 2.8438          | 0.0483   |
| 3.0085        | 0.93  | 27   | 2.8398          | 0.0484   |
| 2.9165        | 0.97  | 28   | 2.8359          | 0.0485   |
| 2.8525        | 1.0   | 29   | 2.8340          | 0.0486   |
| 2.7759        | 1.03  | 30   | 2.8301          | 0.0485   |
| 2.7312        | 1.07  | 31   | 2.8281          | 0.0485   |
| 2.6641        | 1.1   | 32   | 2.8262          | 0.0487   |
| 2.7896        | 1.14  | 33   | 2.8242          | 0.0486   |
| 2.7878        | 1.17  | 34   | 2.8223          | 0.0487   |
| 2.4028        | 1.21  | 35   | 2.8203          | 0.0487   |
| 2.5618        | 1.24  | 36   | 2.8184          | 0.0488   |
| 2.6697        | 1.28  | 37   | 2.8164          | 0.0488   |
| 2.6333        | 1.31  | 38   | 2.8145          | 0.0487   |
| 2.4897        | 1.34  | 39   | 2.8125          | 0.0486   |
| 2.4908        | 1.38  | 40   | 2.8105          | 0.0487   |
| 2.6926        | 1.41  | 41   | 2.8086          | 0.0488   |
| 2.6602        | 1.45  | 42   | 2.8066          | 0.0489   |
| 2.8054        | 1.48  | 43   | 2.8047          | 0.0489   |
| 2.5532        | 1.52  | 44   | 2.8047          | 0.0490   |
| 2.4756        | 1.55  | 45   | 2.8027          | 0.0491   |
| 2.6123        | 1.59  | 46   | 2.8008          | 0.0491   |
| 2.5117        | 1.62  | 47   | 2.7988          | 0.0490   |
| 2.5552        | 1.66  | 48   | 2.7969          | 0.0490   |
| 2.5122        | 1.69  | 49   | 2.7949          | 0.0490   |
| 2.5593        | 1.72  | 50   | 2.7930          | 0.0491   |
| 2.5759        | 1.76  | 51   | 2.7910          | 0.0491   |
| 2.5535        | 1.79  | 52   | 2.7891          | 0.0493   |
| 2.6531        | 1.83  | 53   | 2.7871          | 0.0494   |
| 2.5701        | 1.86  | 54   | 2.7852          | 0.0495   |
| 2.6621        | 1.9   | 55   | 2.7832          | 0.0497   |
| 2.532         | 1.93  | 56   | 2.7812          | 0.0496   |
| 2.5928        | 1.97  | 57   | 2.7793          | 0.0497   |
| 2.5486        | 2.0   | 58   | 2.7754          | 0.0497   |
| 2.5009        | 2.03  | 59   | 2.7734          | 0.0497   |
| 2.4346        | 2.07  | 60   | 2.7734          | 0.0498   |
| 2.3259        | 2.1   | 61   | 2.7715          | 0.0497   |
| 2.3569        | 2.14  | 62   | 2.7695          | 0.0498   |
| 2.5898        | 2.17  | 63   | 2.7695          | 0.0498   |
| 2.3657        | 2.21  | 64   | 2.7676          | 0.0498   |
| 2.4875        | 2.24  | 65   | 2.7676          | 0.0498   |
| 2.4392        | 2.28  | 66   | 2.7676          | 0.0497   |
| 2.3595        | 2.31  | 67   | 2.7656          | 0.0497   |
| 2.4757        | 2.34  | 68   | 2.7656          | 0.0498   |
| 2.4617        | 2.38  | 69   | 2.7656          | 0.0498   |
| 2.3376        | 2.41  | 70   | 2.7656          | 0.0499   |
| 2.3129        | 2.45  | 71   | 2.7656          | 0.0498   |
| 2.5703        | 2.48  | 72   | 2.7656          | 0.0498   |
| 2.3491        | 2.52  | 73   | 2.7656          | 0.0498   |
| 2.3484        | 2.55  | 74   | 2.7656          | 0.0498   |
| 2.3782        | 2.59  | 75   | 2.7656          | 0.0497   |
| 2.4033        | 2.62  | 76   | 2.7656          | 0.0498   |
| 2.3821        | 2.66  | 77   | 2.7656          | 0.0498   |
| 2.39          | 2.69  | 78   | 2.7656          | 0.0498   |
| 2.3984        | 2.72  | 79   | 2.7656          | 0.0497   |
| 2.3936        | 2.76  | 80   | 2.7656          | 0.0498   |
| 2.4414        | 2.79  | 81   | 2.7656          | 0.0497   |
| 2.4727        | 2.83  | 82   | 2.7656          | 0.0497   |
| 2.3192        | 2.86  | 83   | 2.7656          | 0.0497   |
| 2.4365        | 2.9   | 84   | 2.7656          | 0.0497   |
| 2.5042        | 2.93  | 85   | 2.7656          | 0.0497   |
| 2.4746        | 2.97  | 86   | 2.7656          | 0.0497   |
| 2.5383        | 3.0   | 87   | 2.7656          | 0.0497   |
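
The Epoch column follows from the Step column: epochs 1.0, 2.0, and 3.0 are logged at steps 29, 58, and 87, so one epoch is 29 optimizer steps. The sketch below reproduces that mapping and gives a rough upper bound on the number of training sequences per epoch. The `epoch_at` helper is hypothetical, and the bound assumes every step consumes a full batch of 16, which the card does not confirm.

```python
# Values read from the results table and the hyperparameter list.
steps_per_epoch = 29          # step 29 is logged at epoch 1.0
total_train_batch_size = 16


def epoch_at(step):
    """Hypothetical helper: fractional epoch for a step, rounded to two decimals."""
    return round(step / steps_per_epoch, 2)


# Rough upper bound on training sequences per epoch; the final batch of an
# epoch may be partial or padded, so the true count can differ.
max_sequences_per_epoch = steps_per_epoch * total_train_batch_size

print(epoch_at(1), epoch_at(4), epoch_at(87))
print(max_sequences_per_epoch)
```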


### Framework versions

- Transformers 4.25.0.dev0
- PyTorch 1.12.1+cu113
- Datasets 2.3.2
- Tokenizers 0.12.1