DrishtiSharma/DialoGPT-large-faqs-block-size-128-bs-16-lr-7e-6

DialoGPT-large-faqs-block-size-128-bs-16-lr-7e-6

This model is a fine-tuned version of microsoft/DialoGPT-large. The training dataset is not specified (the card's metadata records it as None). It achieves the following result on the evaluation set:

Loss: 2.4362
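
An evaluation loss for a language model is often easier to read as a perplexity. The snippet below is a minimal sketch of that conversion. It assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for Trainer-based causal language modeling but is not stated in this card. The helper name perplexity is illustrative.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


eval_loss = 2.4362  # final evaluation loss reported above
print(f"perplexity ~ {perplexity(eval_loss):.2f}")
```

Under that assumption, a loss of 2.4362 corresponds to a perplexity of roughly 11.4.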

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 7e-06
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
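
The linear scheduler listed above decays the learning rate from its peak value to zero over the course of training. The sketch below illustrates the resulting schedule. The total of 800 steps comes from the training results table (20 epochs at 40 steps each). The zero warmup is an assumption, since the card lists no warmup setting, and the function name linear_lr is hypothetical rather than part of any library.

```python
def linear_lr(step: int, base_lr: float = 7e-6, total_steps: int = 800,
              warmup_steps: int = 0) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr (unused when warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly so the rate reaches 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 400, 800):
    print(step, linear_lr(step))
```

With these assumed settings the rate starts at 7e-06, is halved by the midpoint of training (step 400), and reaches zero at step 800.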

Training results

Training Loss	Epoch	Step	Validation Loss
No log	1.0	40	4.4791
No log	2.0	80	3.7462
No log	3.0	120	3.2760
No log	4.0	160	3.0066
No log	5.0	200	2.8421
No log	6.0	240	2.7291
No log	7.0	280	2.6535
No log	8.0	320	2.5975
No log	9.0	360	2.5532
No log	10.0	400	2.5265
No log	11.0	440	2.4987
No log	12.0	480	2.4778
2.9559	13.0	520	2.4655
2.9559	14.0	560	2.4553
2.9559	15.0	600	2.4449
2.9559	16.0	640	2.4456
2.9559	17.0	680	2.4389
2.9559	18.0	720	2.4384
2.9559	19.0	760	2.4372
2.9559	20.0	800	2.4362
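
The validation losses in the table can be checked for convergence with a few lines of code. This is a small illustrative sketch, with the values copied by hand from the table above. The variable names are arbitrary.

```python
# Validation loss per epoch, copied from the training results table.
val_loss = [
    4.4791, 3.7462, 3.2760, 3.0066, 2.8421, 2.7291, 2.6535, 2.5975, 2.5532, 2.5265,
    2.4987, 2.4778, 2.4655, 2.4553, 2.4449, 2.4456, 2.4389, 2.4384, 2.4372, 2.4362,
]

# Epochs (1-indexed) where validation loss rose relative to the previous epoch.
regressions = [i + 1 for i in range(1, len(val_loss)) if val_loss[i] > val_loss[i - 1]]

# Epoch with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

# Total improvement from epoch 15 to epoch 20.
late_gain = val_loss[14] - val_loss[19]

print("regressions at epochs:", regressions)
print("best epoch:", best_epoch)
print(f"improvement, epochs 15-20: {late_gain:.4f}")
```

The loss falls at every epoch except epoch 16, and the final epoch is the best one. The improvement from epoch 15 to epoch 20 is under 0.01, which suggests training had largely plateaued by the end.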

Framework versions

Transformers 4.33.0.dev0
PyTorch 2.0.1+cu118
Datasets 2.14.4.dev0
Tokenizers 0.13.3
