
DeBERTa trained from scratch

Continued training from the 2006 checkpoint: https://huggingface.co/mikesong724/deberta-wiki-2006

Source data: https://dumps.wikimedia.org/archive/2010/

Tools used: https://github.com/mikesong724/Point-in-Time-Language-Model

The 2010 Wikipedia archive (6.1 GB) was trained for 18 epochs, roughly 108 GB of text seen, on top of the 65 GB seen during the 2006 pretraining.
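For quick experimentation, the checkpoint can be loaded with the standard transformers fill-mask pipeline. This is a minimal sketch: the repo id `mikesong724/deberta-wiki-2010` is an assumption based on the naming of the 2006 checkpoint, and the prompt is illustrative only.

```python
from transformers import pipeline

# Assumed repo id, mirroring the naming of the linked 2006 checkpoint.
MODEL_ID = "mikesong724/deberta-wiki-2010"

fill_mask = pipeline("fill-mask", model=MODEL_ID)

# Use the tokenizer's own mask token so the prompt matches the vocabulary.
mask = fill_mask.tokenizer.mask_token
for pred in fill_mask(f"Wikipedia is a free online {mask} that anyone can edit."):
    print(f"{pred['token_str']!r}: {pred['score']:.4f}")
```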

GLUE benchmark results (fine-tuning epochs per task shown in the table):

| Task  | Epochs | Metric                | Score           |
|-------|--------|-----------------------|-----------------|
| CoLA  | 3      | Matthews corr.        | 0.3640          |
| SST-2 | 3      | Accuracy              | 0.9106          |
| MRPC  | 5      | F1 / Accuracy         | 0.8505 / 0.7794 |
| STS-B | 3      | Pearson / Spearman    | 0.8339 / 0.8312 |
| QQP   | 3      | Accuracy / F1         | 0.8965 / 0.8604 |
| MNLI  | 3      | Accuracy (mismatched) | 0.8023          |
| QNLI  | 3      | Accuracy              | 0.8889          |
| RTE   | 3      | Accuracy              | 0.5271          |
| WNLI  | 5      | Accuracy              | 0.3380          |
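The scores above come from standard per-task GLUE fine-tuning. Below is a hedged sketch of one such run (RTE, 3 epochs as in the table) using the usual transformers/datasets stack; the batch size, learning rate, and max sequence length are assumptions, not the author's recorded settings.

```python
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_ID = "mikesong724/deberta-wiki-2010"  # assumed repo id
TASK = "rte"  # sentence-pair task; 3 fine-tuning epochs per the table above

raw = load_dataset("glue", TASK)
tok = AutoTokenizer.from_pretrained(MODEL_ID)

def encode(batch):
    # RTE examples are sentence pairs; truncate to a fixed length (assumed 128).
    return tok(batch["sentence1"], batch["sentence2"], truncation=True, max_length=128)

data = raw.map(encode, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=2)

def accuracy(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="rte-out",
        num_train_epochs=3,
        per_device_train_batch_size=32,  # assumed hyperparameter
        learning_rate=2e-5,              # assumed hyperparameter
        evaluation_strategy="epoch",
    ),
    train_dataset=data["train"],
    eval_dataset=data["validation"],
    tokenizer=tok,  # enables dynamic padding of variable-length batches
    compute_metrics=accuracy,
)
trainer.train()
print(trainer.evaluate())
```

The remaining tasks follow the same pattern with their own column names and epoch counts; the Point-in-Time-Language-Model repo linked above is the place to check for the exact scripts used.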