Text Generation · Transformers · GGUF · English · Inference Endpoints
leafspark committed
Commit 1e63462 (1 parent: c527fdb)

Update README.md

Files changed (1): README.md (+4 -4)
README.md CHANGED
@@ -20,11 +20,11 @@ The GGUFs uploaded are full FP32 precision.
  Using OpenOrca GPT-4 data + cosmopedia for some extra data + dolly15k for instruct
 
  ## Model Details:
- - 71.7M parameters (71,775,700)
+ - 83.59M parameters (83591800)
  - 8 attention heads
- - 32 layers (34 layers on final model)
+ - 40 layers
  - 384 embeddings size
- - 2048/8192/16384 context (please use 4x RoPE scaling, may train a 16k finetuned version later)
+ - 4096/8192/16384 context (please use 2/4x RoPE scaling, may train a 16k finetuned version later)
  - Batch size 16
  - llama.cpp (train-text-from-scratch)
 
@@ -43,7 +43,7 @@ Please structure your prompts in an instruct format for maximum performance.
  - 96gb RAM
  - 10 iterations
  - Loss Target = 2.5 to 3.0
- - Approx 30 samples (>0.0001 epoches)
+ - Approx 480 samples/1M train tokens (>0.0001 epoches)
  - Training data = Refer to OpenOrca page
 
  ## Notes:
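The updated context line asks users to apply 2x/4x RoPE scaling to stretch the 4096-token training context to 8192/16384 tokens. Below is a minimal sketch of doing that at load time, assuming the llama-cpp-python bindings (linear scaling via the `Llama` constructor's `rope_freq_scale`, where 0.5 gives roughly 2x and 0.25 roughly 4x); the GGUF filename and the instruct prompt template are placeholders, since the card only says to use an instruct-format prompt.

```python
from llama_cpp import Llama

# Sketch only: load the FP32 GGUF with linear RoPE scaling to extend context.
# rope_freq_scale is the inverse of the scaling factor: 0.5 -> 2x (8192), 0.25 -> 4x (16384).
llm = Llama(
    model_path="model-f32.gguf",  # placeholder filename
    n_ctx=8192,                   # extended context window
    rope_freq_scale=0.5,          # 2x linear RoPE scaling over the 4096 training context
)

# Instruct-format prompt, per the card's recommendation (exact template is an assumption).
prompt = "### Instruction:\nExplain what RoPE scaling does.\n\n### Response:\n"
out = llm(prompt, max_tokens=64)
print(out["choices"][0]["text"])
```

The same factors should map to llama.cpp's `--rope-freq-scale` flag when running the CLI directly.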