Update README.md
README.md CHANGED
@@ -192,10 +192,10 @@ widget:
 Label:
 ---
 
-<h1 style="font-size: 42px">
+<h1 style="font-size: 42px">GPT-JT</h1>
 
 # Model Summary
-We present GPT-JT, a fork of GPT-J-6B, trained for 20,000 steps, that outperforms most 100B+ parameter models at classification, and improves on most tasks. GPT-JT was trained with a new decentralized algorithm.
+We present GPT-JT, a fork of GPT-J-6B, trained for 20,000 steps, that outperforms most 100B+ parameter models at classification, and improves on most tasks relative to GPT-J-6B. GPT-JT was trained with a new decentralized algorithm on computers networked over slow 1 Gbps links.
 GPT-JT is a bidirectional dense model, trained with the UL2 objective on NI, P3, COT, and the Pile data.
 
 **Please check out our demo: [TOMA-app](https://huggingface.co/spaces/togethercomputer/TOMA-app).**
@@ -204,7 +204,7 @@ GPT-JT is a bidirectional dense model, trained with the UL2 objective on NI, P3
 ```python
 from transformers import pipeline
 pipe = pipeline(model='togethercomputer/GPT-JT-6B-v1')
-pipe('''
+pipe('''I like this! <-- Is it positive or negative?\nA:''')
 ```
 
 or
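For context, a minimal sketch of running the snippet added in the second hunk end to end. The generation settings (`max_new_tokens`, `do_sample`) and the printed field are assumptions for illustration, not part of the README change itself:

```python
# Minimal sketch of the pipeline example added in the diff above.
# max_new_tokens / do_sample are assumed illustration values, not from the README.
from transformers import pipeline

pipe = pipeline(model="togethercomputer/GPT-JT-6B-v1")

prompt = "I like this! <-- Is it positive or negative?\nA:"
result = pipe(prompt, max_new_tokens=5, do_sample=False)

# The text-generation pipeline returns a list of dicts with a "generated_text" key.
print(result[0]["generated_text"])
```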