## Model Summary
- **Architecture:** Decoder-only Transformer (GPT-style)
- **Layers:** 1 decoder layer
- **Hidden Size:** 64
- **Attention Heads:** 4
- **Vocabulary Size:** 50257 (based on distilgpt2)
- **Max Sequence Length:** 64
- **Parameters:** < 1M
- **Framework:** PyTorch
- **Training Data:** Synthetic English text
- **Use Case:** Text generation
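
The summary above maps onto a standard Hugging Face GPT-2 configuration. The sketch below is an illustration only, not the repository's actual `config.json`, which may differ in details such as dropout or weight tying:

```python
# Hedged sketch: expressing the model summary as a Hugging Face GPT2Config.
# The uploaded checkpoint's real configuration may differ in minor details.
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(
    vocab_size=50257,   # distilgpt2 tokenizer vocabulary
    n_positions=64,     # max sequence length
    n_embd=64,          # hidden size
    n_layer=1,          # single decoder layer
    n_head=4,           # attention heads
)
model = GPT2LMHeadModel(config)
```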
## How to Use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("hannanechiporenko25/decoder-only-transformer-small")
model = AutoModelForCausalLM.from_pretrained("hannanechiporenko25/decoder-only-transformer-small")

input_ids = tokenizer("To be or not to be", return_tensors="pt").input_ids
output = model.generate(input_ids, max_length=50)
print(tokenizer.decode(output[0]))
```
## Training Details
This model was trained for 5 epochs on a small synthetic corpus to demonstrate the structure and behavior of a decoder-only transformer. The training loss decreased steadily, and the model generates coherent (but limited) sequences.
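
For readers who want to reproduce a comparable setup, a minimal training loop of the kind described above might look like the sketch below. The corpus, batch size, and learning rate here are illustrative assumptions; the actual training script is not part of this card.

```python
# Hedged sketch: a minimal causal-LM training loop consistent with the description above.
# The corpus, optimizer settings, and batching are placeholder assumptions.
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, GPT2Config, GPT2LMHeadModel

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token

config = GPT2Config(vocab_size=50257, n_positions=64, n_embd=64, n_layer=1, n_head=4)
model = GPT2LMHeadModel(config)

texts = ["To be or not to be.", "The cat sat on the mat."]  # placeholder synthetic corpus
enc = tokenizer(texts, padding="max_length", truncation=True, max_length=64, return_tensors="pt")
dataset = list(zip(enc.input_ids, enc.attention_mask))
loader = DataLoader(dataset, batch_size=2, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)
model.train()
for epoch in range(5):  # the card states 5 epochs
    for input_ids, attention_mask in loader:
        # Ignore padded positions in the language-modeling loss.
        labels = input_ids.masked_fill(attention_mask == 0, -100)
        outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    print(f"epoch {epoch}: loss {outputs.loss.item():.4f}")
```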
## Intended Use
- Teaching / experimentation
- Lightweight text generation demos
- Transformer architecture playground
**Tags:**
- `text-generation`
- `causal-lm`
- `decoder-only`
- `small-model`