---
title: GPT From Scratch
emoji: ⚡
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 4.4.0
app_file: app.py
pinned: false
license: mit
---
# GPT from scratch
This repo contains code to train a GPT from scratch. The training data is drawn from [RedPajama-Data-1T-Sample](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T-Sample), a sample of the RedPajama 1-trillion-token dataset; only a subset of these samples is used for training. The transformer implementation is similar to [LitGPT](https://github.com/Lightning-AI/lit-gpt).
The trained model has about 160M parameters, and the final training loss was 3.2154.
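
As a rough sanity check on the 160M figure, the parameter count of a GPT-style model can be estimated from its hyperparameters. The sketch below assumes a Pythia-160M-style configuration (12 layers, embedding width 768, vocabulary padded to 50,304, untied input and output embeddings, biases on all linear layers). These values are assumptions for illustration and are not taken from this repo's notebooks; `count_gpt_params` is a hypothetical helper.

```python
def count_gpt_params(n_layer, n_embd, padded_vocab_size, mlp_ratio=4):
    """Estimate parameters of a GPT with untied embeddings and biased linears."""
    # Token embedding and a separate (untied, bias-free) output head.
    embedding = padded_vocab_size * n_embd
    lm_head = padded_vocab_size * n_embd

    # Attention: fused QKV projection plus output projection, both with bias.
    qkv = n_embd * 3 * n_embd + 3 * n_embd
    attn_proj = n_embd * n_embd + n_embd

    # MLP: expand to mlp_ratio * n_embd, then project back, both with bias.
    hidden = mlp_ratio * n_embd
    mlp = (n_embd * hidden + hidden) + (hidden * n_embd + n_embd)

    # Two LayerNorms per block, each with a weight and a bias vector.
    norms = 2 * 2 * n_embd

    per_block = qkv + attn_proj + mlp + norms
    final_norm = 2 * n_embd
    return embedding + lm_head + n_layer * per_block + final_norm


# Assumed Pythia-160M-style hyperparameters (not confirmed by this repo).
total = count_gpt_params(n_layer=12, n_embd=768, padded_vocab_size=50304)
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

Under these assumptions the estimate comes out slightly above 160M, with almost half of the parameters sitting in the two embedding matrices.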

Training details can be found in the attached notebooks. The initial training run was stopped when the loss was around 4. Training was then resumed from that checkpoint and stopped once the loss fell below 3.5.
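
The stop-and-resume procedure described above can be sketched as a loop that trains until a loss threshold is reached, saves its state, and picks up from the saved state on the next call. This is a toy illustration, not the code from the notebooks: `train_until` and `fake_step` are hypothetical names, the loss is simulated, and a JSON file stands in for a real model checkpoint.

```python
import json
import os
import tempfile


def train_until(target_loss, step_fn, ckpt_path, max_steps=10_000):
    """Run step_fn until the loss drops below target_loss, then checkpoint."""
    # Resume from an existing checkpoint if one is present.
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)
    else:
        state = {"step": 0, "loss": float("inf")}

    while state["loss"] >= target_loss and state["step"] < max_steps:
        state["step"] += 1
        state["loss"] = step_fn(state["step"])

    # Save progress so a later call can continue from this point.
    with open(ckpt_path, "w") as f:
        json.dump(state, f)
    return state


def fake_step(step):
    # Stand-in for one optimizer step: a simulated loss that decays with step.
    return 3.0 + 8.0 / (1.0 + 0.01 * step)


ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
first = train_until(4.0, fake_step, ckpt)   # initial run, stop near loss 4
second = train_until(3.5, fake_step, ckpt)  # resumed run, stop below 3.5
print(first["step"], second["step"])
```

In a real run the checkpoint would also hold the model weights and optimizer state, so that the resumed run continues the same optimization trajectory.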

GitHub link: https://github.com/mkthoma/gpt_from_scratch