---
language: en
tags:
- pytorch
- gpt2
- language-model
pipeline_tag: text-generation
---

# GPT-X Model

This model was trained using the GPT-X framework.
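
Since the card's `pipeline_tag` is `text-generation`, the checkpoint can presumably be used through the Hugging Face `transformers` text-generation pipeline. The sketch below assumes that compatibility; the model id is a placeholder, not a repository confirmed by this card.

```python
from transformers import pipeline

# Placeholder model id: replace with the actual repository id or local path
# of this GPT-X checkpoint.
generator = pipeline("text-generation", model="path/to/gpt-x-checkpoint")

output = generator(
    "Once upon a time",
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
)
print(output[0]["generated_text"])
```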

## Model Architecture

- Layers: 12
- Attention Heads: 12
- Hidden Size: 768
- Vocabulary Size: 50257
- Maximum Sequence Length: 1024
- Model Type: base
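
These hyperparameters correspond to a GPT-2-small-sized model (roughly 124M parameters). As a minimal sketch, assuming the checkpoint follows the standard GPT-2 architecture implied by the `gpt2` tag, an equivalent `transformers` configuration looks like this:

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Mirror the architecture hyperparameters listed above.
config = GPT2Config(
    n_layer=12,          # Layers
    n_head=12,           # Attention Heads
    n_embd=768,          # Hidden Size
    vocab_size=50257,    # Vocabulary Size
    n_positions=1024,    # Maximum Sequence Length
)
model = GPT2LMHeadModel(config)
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")  # ~124M
```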

## Training Details

- Batch Size: 524288
- Learning Rate: 0.0006
- Weight Decay: 0.0
- Mixed Precision: True
- Optimizer: muon
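
The settings above can be wired into an ordinary PyTorch loop. The sketch below is not the GPT-X training code: it reads the batch size of 524,288 (2^19) as tokens per optimizer step reached through gradient accumulation, interprets "Mixed Precision: True" as bf16 autocast, and substitutes AdamW for Muon, since Muon's construction is provided by the GPT-X framework and not documented here. The micro-batch size and dummy data are illustrative assumptions.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Hyperparameters from the list above. Reading 524,288 as tokens per
# optimizer step (assumption) gives the gradient-accumulation factor.
lr, wd = 6e-4, 0.0
tokens_per_step, seq_len = 524_288, 1024
micro_bs = 8                                            # assumed micro-batch
accum_steps = tokens_per_step // (micro_bs * seq_len)   # 64 micro-steps

model = GPT2LMHeadModel(GPT2Config(n_layer=12, n_head=12, n_embd=768,
                                   vocab_size=50257, n_positions=1024))
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device).train()

# Placeholder optimizer: the card lists Muon, whose API comes from the
# GPT-X framework; AdamW is used here only to keep the sketch runnable.
opt = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=wd)

for step in range(accum_steps):  # dummy data: random token ids
    ids = torch.randint(0, 50257, (micro_bs, seq_len), device=device)
    # "Mixed Precision: True" interpreted as bf16 autocast (assumption).
    with torch.autocast(device_type=device, dtype=torch.bfloat16):
        loss = model(input_ids=ids, labels=ids).loss
    (loss / accum_steps).backward()
    if (step + 1) % accum_steps == 0:
        opt.step()
        opt.zero_grad(set_to_none=True)
        print(f"optimizer step done, last micro-loss = {loss.item():.3f}")
```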