**Model Summary**

- **Architecture:** Decoder-only Transformer (GPT-style)
- **Layers:** 1 decoder layer
- **Hidden size:** 64
- **Attention heads:** 4
- **Vocabulary size:** 50257 (from the distilgpt2 tokenizer)
- **Max sequence length:** 64
- **Parameters:** < 1M
- **Framework:** PyTorch
- **Training data:** synthetic English text
- **Use case:** text generation
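
For orientation, the numbers above map onto a GPT-2-style configuration roughly as follows. This is a sketch: whether `GPT2Config` is the class actually used for this model is an assumption.

```python
from transformers import GPT2Config

# Sketch of a config matching the summary above; the actual
# config class used for training is an assumption.
config = GPT2Config(
    vocab_size=50257,  # distilgpt2 tokenizer vocabulary
    n_positions=64,    # max sequence length
    n_embd=64,         # hidden size
    n_layer=1,         # single decoder layer
    n_head=4,          # attention heads
)
```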

**How to Use**

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model from the Hub
tokenizer = AutoTokenizer.from_pretrained("hannanechiporenko25/decoder-only-transformer-small")
model = AutoModelForCausalLM.from_pretrained("hannanechiporenko25/decoder-only-transformer-small")

# Encode a prompt and generate up to 50 tokens (greedy decoding)
input_ids = tokenizer("To be or not to be", return_tensors="pt").input_ids
output = model.generate(input_ids, max_length=50)
print(tokenizer.decode(output[0]))
```
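
Greedy decoding as above is deterministic. For more varied output, sampling parameters can be passed to `generate`; the values below are illustrative, not recommendations from this model card.

```python
output = model.generate(
    input_ids,
    max_length=50,
    do_sample=True,                       # sample instead of greedy decoding
    top_k=50,                             # keep only the 50 most likely tokens
    temperature=0.8,                      # illustrative value, tune to taste
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 tokenizers define no pad token
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```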

**Training Details**

This model was trained for 5 epochs on a small synthetic corpus to demonstrate the structure and behavior of a decoder-only transformer. The loss decreased gradually during training, and the model generates coherent (but limited) sequences.
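
The training script itself is not published; a minimal causal-LM loop of the kind described would look roughly like the sketch below. The optimizer, learning rate, and `dataloader` are assumptions, and `model` is the one loaded in the usage example above.

```python
import torch

# Minimal sketch of a causal-LM training loop; the actual script,
# optimizer, and learning rate are assumptions, not published details.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)
model.train()
for epoch in range(5):        # 5 epochs, per the description above
    for batch in dataloader:  # hypothetical DataLoader yielding token-ID tensors of shape (B, 64)
        outputs = model(input_ids=batch, labels=batch)  # HF models shift labels internally
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```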

**Intended Use**

- Teaching / experimentation
- Lightweight text generation demos
- Transformer architecture playground

**Tags**:
- `text-generation`
- `causal-lm`
- `decoder-only`
- `small-model`
