# GPT-2 (125M), 4k-token context
A fine-tuned version of the smallest GPT-2 model (125M parameters) on The Pile, with the context length extended to 4k tokens.
The weights are included, and the model follows Karpathy's nanoGPT implementation.
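Since the weights follow the nanoGPT format, they can be loaded with nanoGPT's `model.py`. Below is a minimal sketch; the checkpoint filename (`ckpt.pt`) and the exact dictionary keys are assumptions based on nanoGPT's default `train.py`/`sample.py` and may differ for the files shipped here.

```python
# Minimal loading sketch (assumes a nanoGPT-style checkpoint named ckpt.pt).
import torch
from model import GPTConfig, GPT  # from Karpathy's nanoGPT

ckpt = torch.load("ckpt.pt", map_location="cpu")  # assumed filename
gptconf = GPTConfig(**ckpt["model_args"])         # block_size should be 4096 here
model = GPT(gptconf)

# nanoGPT may prefix keys with '_orig_mod.' when the model was torch.compile'd
state_dict = ckpt["model"]
prefix = "_orig_mod."
for k in list(state_dict.keys()):
    if k.startswith(prefix):
        state_dict[k[len(prefix):]] = state_dict.pop(k)

model.load_state_dict(state_dict)
model.eval()
```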
The model was trained for ~1 million iterations with a progressively increasing batch size, ending at 32k.
The final loss is 3.9, which is probably limited by the 768-dimensional embedding size.
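For reference, a nanoGPT-style training config matching the setup above might look like the following sketch. Only the 4k context, the GPT-2 small dimensions (768 embedding), the ~1M iterations, and the final 32k batch size come from this README; the remaining values (learning rate, accumulation steps, dataset name) are placeholders, and 32k is interpreted here as tokens per batch.

```python
# Hypothetical nanoGPT config sketch; unstated values are placeholders.
block_size = 4096                  # 4k-token context
n_layer = 12
n_head = 12
n_embd = 768                       # GPT-2 small embedding size
max_iters = 1_000_000              # ~1M iterations
dataset = "the_pile"               # assumed name for data/the_pile

# Batch size was increased over training; the final ~32k tokens per step
# could be reached as e.g. 8 sequences x 4096 tokens via accumulation.
batch_size = 1
gradient_accumulation_steps = 8

learning_rate = 6e-4               # placeholder, not stated in the README
```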