Uploaded better trained version.
- README.md +3 -1
- model-00001-of-00002.safetensors +1 -1
- model-00002-of-00002.safetensors +1 -1
README.md
CHANGED
````diff
@@ -10,9 +10,11 @@ pipeline_tag: text-generation
 library_name: transformers
 ---
 # GPT4chan 24B AWQ
+
+
 This model is [v2ray/GPT4chan-24B](https://huggingface.co/v2ray/GPT4chan-24B) quantized to int4 using [casper-hansen/AutoAWQ](https://github.com/casper-hansen/AutoAWQ).
 
-Trained using 8x H100 with global batch size 64, using 2e-4 learning rate, for
+Trained using 8x H100 with global batch size 64, using 2e-4 learning rate, for 4000 steps, which is approximately 5 epochs.
 ## Prompt Format
 ```
 board<|start_header_id|>id<|end_header_id|>content<|start_header_id|>id<|end_header_id|>content...<|start_header_id|>id<|end_header_id|>
````
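The prompt format in the README is a flat concatenation of a board name followed by alternating id/content segments. A minimal sketch of how such a prompt could be assembled — the `build_prompt` helper and all sample ids and contents are hypothetical, not from the model card:

```python
# Special tokens as shown in the README's prompt format.
START = "<|start_header_id|>"
END = "<|end_header_id|>"

def build_prompt(board, posts, next_id):
    """Concatenate board name, then each (id, content) pair, then a
    trailing id header — the model is expected to generate the next
    post's content after it. Names and structure here are an
    illustrative reading of the README, not an official API."""
    out = board
    for post_id, content in posts:
        out += f"{START}{post_id}{END}{content}"
    out += f"{START}{next_id}{END}"
    return out

prompt = build_prompt("g", [("1001", "first post"), ("1002", "reply")], "1003")
```

This mirrors the documented pattern `board<|start_header_id|>id<|end_header_id|>content...` with the final header left open for generation.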
model-00001-of-00002.safetensors
CHANGED
```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:b58bc6e6f998bb4cc4d28cce276591fe4c5351b88a7d7aa79359dc9dd01074e5
 size 9917518176
```
model-00002-of-00002.safetensors
CHANGED
```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:ea4b0ab5829553d135964b44a234dae8493c7f875ef50d550848a618bf81189e
 size 4316893568
```
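The two `.safetensors` entries above are Git LFS pointer files: the `oid` is the SHA-256 digest of the actual shard and `size` is its byte length. After downloading a shard, its integrity can be checked against the pointer's oid; a minimal sketch (the file path and the commented digest comparison reflect this commit's pointers, everything else is generic stdlib code):

```python
import hashlib

def sha256_of(path, chunk=1 << 20):
    """Stream a file in 1 MiB chunks and return its SHA-256 hex digest,
    suitable for comparing against a Git LFS pointer's oid."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

# Example usage with the first shard from this commit:
# assert sha256_of("model-00001-of-00002.safetensors") == \
#     "b58bc6e6f998bb4cc4d28cce276591fe4c5351b88a7d7aa79359dc9dd01074e5"
```

Streaming rather than reading the whole file at once matters here: the shards are roughly 9.9 GB and 4.3 GB per the `size` fields.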