Update README.md
README.md CHANGED
@@ -197,7 +197,8 @@ widget:
 # Model Summary
 We present GPT-JT, a fork of GPT-J (6B), trained for 20,000 steps, that outperforms most 100B+ parameter models at classification and improves on most tasks. GPT-JT was trained with a new decentralized algorithm over a 1Gbps interconnect.
 GPT-JT is a bidirectional dense model, trained through the UL2 objective on Natural Instructions (NI), P3, Chain-of-Thought (CoT), and the Pile data.
-
+
+**Please check out our demo: [TOMA-app](https://huggingface.co/spaces/togethercomputer/TOMA-app).**
 
 # Quick Start
 ```python
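The Quick Start code block is truncated at the end of this hunk, so the snippet below is only a rough sketch of how one might load and prompt GPT-JT with the Hugging Face `transformers` AutoModel API; the model identifier `togethercomputer/GPT-JT-6B-v1` and the example prompt are assumptions, not taken from the README itself.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumed model id -- the README's actual Quick Start may use a different identifier.
model_id = "togethercomputer/GPT-JT-6B-v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Zero-shot classification-style prompt, in line with the model summary's
# emphasis on classification tasks.
prompt = (
    "Label the sentiment of the review as positive or negative.\n\n"
    "Review: The movie was fantastic.\n"
    "Sentiment:"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)

# Print only the newly generated tokens after the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```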