Update README.md
README.md (CHANGED)
@@ -77,7 +77,7 @@ widget:
 <h1 style="font-size: 42px">GPT-JT<h1/>
 
 # Model Summary
 
-We present GPT-JT, a fork of GPT-6B, trained
+We present GPT-JT, a fork of GPT-6B, trained on 3.5 billion tokens, that outperforms most 100B+ parameter models at classification, and improves most tasks relative to GPT-J-6B. GPT-JT was trained with a new decentralized algorithm on computers networked on slow 1Gbps links.
 
 GPT-JT is a bidirectional dense model, trained through UL2 objective with NI, P3, COT, the pile data.
 
 **Please check out our [Online Demo](https://huggingface.co/spaces/togethercomputer/GPT-JT)!**
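Since the summary added in this commit describes GPT-JT as a GPT-J-6B derivative published on the Hugging Face Hub, a minimal usage sketch with the `transformers` library may help readers of the model card. This is not part of the commit; the checkpoint ID `togethercomputer/GPT-JT-6B-v1` and the example prompt are assumptions, so check the model page for the exact repository name.

```python
# Minimal sketch: loading GPT-JT with Hugging Face transformers.
# The checkpoint ID below is an assumption, not taken from the diff.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "togethercomputer/GPT-JT-6B-v1"  # assumed repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A simple classification-style prompt, since the new summary highlights
# classification performance.
prompt = "Review: The movie was fantastic.\nSentiment:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```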