# GPT-X Model
This model was trained using the GPT-X framework.
## Model Architecture
- Layers: 12
- Attention Heads: 12
- Hidden Size: 768
- Vocabulary Size: 50257
- Maximum Sequence Length: 1024
- Model Type: base
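The architecture values above match a standard GPT-2-style decoder. Assuming that layout (tied input/output embeddings, QKV + output projections in attention, a 4x-wide MLP), a rough parameter count can be estimated as follows; the exact GPT-X implementation may differ.

```python
# Rough parameter-count estimate from the hyperparameters listed above.
# Assumes a GPT-2-style decoder with tied embeddings and a 4x MLP width;
# this is an illustrative sketch, not the GPT-X framework's own accounting.

n_layer, d_model = 12, 768
vocab, seq_len = 50257, 1024

token_emb = vocab * d_model                   # token embedding (tied with LM head)
pos_emb = seq_len * d_model                   # learned positional embedding
attn = 4 * d_model**2                         # Q, K, V, and output projections
mlp = 8 * d_model**2                          # up-projection (4x) + down-projection
total = token_emb + pos_emb + n_layer * (attn + mlp)

print(f"~{total / 1e6:.0f}M parameters")      # ~124M, consistent with GPT-2 small
```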
## Training Details
- Batch Size: 524288
- Learning Rate: 0.0006
- Weight Decay: 0.0
- Mixed Precision: True
- Optimizer: Muon
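For reference, the training hyperparameters above can be collected into a single config. The field names below are illustrative, not the GPT-X framework's actual configuration schema.

```python
# Training hyperparameters transcribed verbatim from the card above.
# Keys are hypothetical; the GPT-X framework's real config format is not documented here.
train_config = {
    "batch_size": 524288,       # reported as-is in the card
    "learning_rate": 6e-4,
    "weight_decay": 0.0,
    "mixed_precision": True,
    "optimizer": "muon",
}

print(train_config["learning_rate"])
```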