Update README.md
README.md (CHANGED)
@@ -77,7 +77,7 @@ widget:
 <h1 style="font-size: 42px">GPT-JT<h1/>
 
 # Model Summary
 
-We present GPT-JT, a fork of GPT-6B, trained
+We present GPT-JT, a fork of GPT-6B, trained on 3.5 billion tokens, that outperforms most 100B+ parameter models at classification, and improves most tasks relative to GPT-J-6B. GPT-JT was trained with a new decentralized algorithm on computers networked on slow 1Gbps links.
 
 GPT-JT is a bidirectional dense model, trained through UL2 objective with NI, P3, COT, the pile data.
 
 **Please check out our [Online Demo](https://huggingface.co/spaces/togethercomputer/GPT-JT)!**
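Since the summary added in this commit describes GPT-JT as a GPT-J-6B derivative published on the Hugging Face Hub, a minimal usage sketch with the `transformers` library may help readers of the model card. This is not part of the commit; the checkpoint ID `togethercomputer/GPT-JT-6B-v1` and the example prompt are assumptions, so check the model page for the exact repository name.

```python
# Minimal sketch: loading GPT-JT with Hugging Face transformers.
# The checkpoint ID below is an assumption, not taken from the diff.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "togethercomputer/GPT-JT-6B-v1"  # assumed repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A simple classification-style prompt, since the new summary highlights
# classification performance.
prompt = "Review: The movie was fantastic.\nSentiment:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```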