Toy_GPTs_LLMs_for_CPU_Educational
With one layer (n_layer 1), n_embd 4 is a failure, but n_embd 6 is a marginal success.
At 'n_embd': 4, the response showed no coherence at all.
At 'n_embd': 6, 'n_layer': 1, 'n_head': 1, 'n_inner': 64, the Toy Gettysburg GPT-2 model got off to a good start with "four score and seven years ago our fathers brought forth on this continent , a new nation , conceived in" before making some mistakes. It then resumed with another whole part of the Gettysburg speech: "that all men are created equal . now we are engaged in a great civil war , testing whether that nation , or any nation so conceived and so dedicated , can long endure . we are met on a great battle - field of that war . we have come to dedicate a portion of that field , as a final resting place for those who here gave their lives that that nation might endure"
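One way to put a number on "before some mistakes" is to count how many leading tokens of a generated passage match the reference wording. The sketch below is only an illustration: the helper name `matching_prefix_len` is hypothetical and not part of the training script, and splitting on whitespace is an assumption based on how the outputs are printed. Both strings are copied from this README; the reference wording comes from the complete recital shown further down.

```python
def matching_prefix_len(reference, generated):
    """Return how many leading tokens of `generated` match `reference`."""
    count = 0
    for ref_tok, gen_tok in zip(reference, generated):
        if ref_tok != gen_tok:
            break
        count += 1
    return count


# Reference wording, as it appears in the complete recital later in this README.
reference = (
    "we have come to dedicate a portion of that field , as a final "
    "resting place for those who here gave their lives that that "
    "nation might live"
).split()

# Wording produced by the one-layer, n_embd 6 model quoted above.
generated = (
    "we have come to dedicate a portion of that field , as a final "
    "resting place for those who here gave their lives that that "
    "nation might endure"
).split()

n = matching_prefix_len(reference, generated)
print(f"{n} of {len(reference)} tokens match")
if n < min(len(reference), len(generated)):
    # Report the first position where the two token sequences disagree.
    print(f"first mismatch: expected {reference[n]!r}, got {generated[n]!r}")
```

For this pair the only disagreement is the final token: the one-layer model says "endure" where the address says "live".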
Adding a second layer to the 6-float model ('n_embd': 6, 'n_layer': 2, 'n_head': 1, 'n_inner': 64), with no other modifications, did solve the glitch, after almost 60,000 epochs (and an expertly timed, gradually receding learning rate):
The resulting model_checkpoint_early_stop_Gettysburg_GPT2_v1.4.2.1.py_2024-11-29_01-41-39.pth has a Size on Disk of only 0.99 MB (1,040,384 bytes).
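To get a feel for how small these configurations are, here is a rough parameter count. It is a sketch that assumes the standard GPT-2 layout (token and position embeddings, one fused QKV projection, a two-layer MLP, layer norms with bias, and an output head tied to the token embeddings); the actual script may differ. `vocab_size` 170 is taken from the hyperparameters listed in this README, while `N_POSITIONS = 512` is a placeholder, because the context length is not shown here.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer, n_inner):
    """Count parameters for a standard GPT-2 layout with a tied output head."""
    # Token embedding table plus learned position embedding table.
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Each layer norm has a weight vector and a bias vector.
    layer_norm = 2 * n_embd
    # Fused QKV projection (with bias) plus the attention output projection.
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: n_embd -> n_inner -> n_embd, both with bias.
    mlp = (n_embd * n_inner + n_inner) + (n_inner * n_embd + n_embd)
    per_layer = 2 * layer_norm + attention + mlp
    # The last term is the final layer norm applied after all blocks.
    return embeddings + n_layer * per_layer + layer_norm


N_POSITIONS = 512  # assumption: placeholder context length, not from the README

for n_embd, n_layer in [(4, 1), (6, 1), (6, 2)]:
    total = gpt2_param_count(170, N_POSITIONS, n_embd, n_layer, n_inner=64)
    print(f"n_embd={n_embd}, n_layer={n_layer}: {total} parameters")
```

The number of heads does not appear in the formula because splitting the attention into heads does not change the size of the projection matrices.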
A loss BELOW 0.01 is usually sufficient to obtain a complete recital of the entire Gettysburg Address. But I pushed the epoch loss down to 0.001, whatever that means.
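The two thresholds above could be checked once per epoch along the following lines. This is a minimal sketch, not the script's actual code: the function name `check_loss` and its return shape are hypothetical, and only the message wording is modelled on the training log in this README.

```python
RECITAL_THRESHOLD = 0.01  # from the text: usually enough for a complete recital
STOP_THRESHOLD = 0.001    # from the text: the target the loss was pushed down to


def check_loss(avg_loss):
    """Return (messages, should_stop) for one epoch's average loss."""
    messages = []
    if avg_loss < RECITAL_THRESHOLD:
        messages.append(f"LOSS IS BELOW {RECITAL_THRESHOLD}")
    should_stop = avg_loss < STOP_THRESHOLD
    if should_stop:
        messages.append(f"LOSS IS BELOW {STOP_THRESHOLD}")
        messages.append(
            f"Early stopping: Average loss {avg_loss:.4f} "
            f"is below the threshold ({STOP_THRESHOLD})."
        )
    return messages, should_stop


# A loss just under 0.001 prints as 0.0010 when rounded to four decimals.
for line in check_loss(0.00098)[0]:
    print(line)
```

The rounding noted in the last comment is one way a loss displayed as 0.0010 can still count as strictly below 0.001.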
Epoch 22361/100000, Loss: 0.0054
LOSS IS BELOW 0.01
Epoch 22362/100000, Loss: 0.0033
LOSS IS BELOW 0.01
Epoch 22363/100000, Loss: 0.0044
LOSS IS BELOW 0.01
Epoch 22364/100000, Loss: 0.0032
...
Epoch 26651/100000, Loss: 0.0024
LOSS IS BELOW 0.01
Epoch 26652/100000, Loss: 0.0039
LOSS IS BELOW 0.01
Epoch 26653/100000, Loss: 0.0024
LOSS IS BELOW 0.01
Epoch 26654/100000, Loss: 0.0034
LOSS IS BELOW 0.01
...
Epoch 35255/100000, Loss: 0.0017
LOSS IS BELOW 0.01
Epoch 35256/100000, Loss: 0.0018
LOSS IS BELOW 0.01
Epoch 35257/100000, Loss: 0.0015
LOSS IS BELOW 0.01
Epoch 35258/100000, Loss: 0.0024
LOSS IS BELOW 0.01
Epoch 35259/100000, Loss: 0.0021
LOSS IS BELOW 0.01
Epoch 35260/100000, Loss: 0.0042
LOSS IS BELOW 0.01
...
Epoch 44408/100000, Loss: 0.0015
LOSS IS BELOW 0.01
Learning rate reduced to 0.000034
Epoch 44408/100000, Loss: 0.0015, Learning Rate: 0.000034
Epoch 44409/100000, Loss: 0.0014
LOSS IS BELOW 0.01
Epoch 44410/100000, Loss: 0.0065
LOSS IS BELOW 0.01
Epoch 44411/100000, Loss: 0.0028
...
Epoch 55978/100000, Loss: 0.0016
LOSS IS BELOW 0.01
Epoch 55979/100000, Loss: 0.0020
LOSS IS BELOW 0.01
Learning rate reduced to 0.000011
Epoch 55979/100000, Loss: 0.0020, Learning Rate: 0.000011
Epoch 55980/100000, Loss: 0.0016
LOSS IS BELOW 0.01
Epoch 55981/100000, Loss: 0.0014
LOSS IS BELOW 0.01
...
Epoch 58992/100000, Loss: 0.0014
LOSS IS BELOW 0.01
Epoch 58993/100000, Loss: 0.0030
LOSS IS BELOW 0.01
Epoch 58994/100000, Loss: 0.0014
LOSS IS BELOW 0.01
Epoch 58995/100000, Loss: 0.0010
LOSS IS BELOW 0.01
LOSS IS BELOW 0.001
Early stopping: Average loss 0.0010 is below the threshold (0.001).
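The log above shows the learning rate being reduced twice during the run (to 0.000034 and then to 0.000011). How those reductions were triggered is not shown, so the following is only a sketch of one common approach: cut the rate when the loss has not improved for a while. The class `PlateauLR` and the `factor`, `patience`, `min_lr`, and starting `lr` values are all hypothetical. The loss values fed in are a few figures copied from the log, used purely as sample input.

```python
class PlateauLR:
    """Cut the learning rate when the loss stops improving (hypothetical helper)."""

    def __init__(self, lr, factor=0.5, patience=3, min_lr=1e-6):
        self.lr = lr
        self.factor = factor      # multiplier applied on each reduction
        self.patience = patience  # epochs without improvement before reducing
        self.min_lr = min_lr      # never go below this rate
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        """Record one epoch's loss; return True if the learning rate was reduced."""
        if loss < self.best:
            self.best = loss
            self.bad_epochs = 0
            return False
        self.bad_epochs += 1
        if self.bad_epochs >= self.patience and self.lr > self.min_lr:
            self.lr = max(self.lr * self.factor, self.min_lr)
            self.bad_epochs = 0
            print(f"Learning rate reduced to {self.lr:.6f}")
            return True
        return False


# Assumed starting rate; the real initial learning rate is not shown in the log.
sched = PlateauLR(lr=0.0001, factor=0.5, patience=3)
for loss in [0.0054, 0.0033, 0.0044, 0.0032, 0.0024, 0.0039, 0.0024, 0.0034]:
    sched.step(loss)
```

In a PyTorch training loop, the built-in `torch.optim.lr_scheduler.ReduceLROnPlateau` scheduler plays a similar role; whether the script uses it or a hand-timed schedule is not stated here.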
# --- Inference Examples --- at script line 431
# Example 1: Recite the Gettysburg Address at script line 435
Prompt: four score
Response:
four score and seven years ago our fathers brought forth on this continent , a new nation , conceived in liberty , and dedicated to the proposition that all men are created equal . now we are engaged in a great civil war , testing whether that nation , or any nation so conceived and so dedicated , can long endure . we are met on a great battle - field of that war . we have come to dedicate a portion of that field , as a final resting place for those who here gave their lives that that nation might live . it is altogether fitting and proper that we should do this . but , in a larger sense , we can not dedicate - we can not consecrate - we can not hallow - this ground . the brave men , living and dead , who struggled here , have consecrated it , far above our poor power to add or detract . the world will little note , nor long remember what we say here , but it can never forget what they did here . it is for us the living , rather , to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced . it is rather for us to be here dedicated to the great task remaining before us - that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion - that we here highly resolve that these dead shall not have died in vain - that this nation , under god , shall have a new birth of freedom - and that government of the people , by the people , for the people , shall not perish from the earth . apple blossom cantaloupe durian elderberry fig guava honeydew iguana iguana iguana iguana iguana iguana iguana iguana iguana measure god apple . we we we we we we we
# Example 2: Free text generation after encountering <FreetheLLM> at script line 445
Prompt: we here highly resolve that these dead shall not have died in vain and that this nation under god shall have a new <FreetheLLM>
Freestyle Generation:
we here highly resolve that these dead shall not have died in vain and that this nation under god shall have a new <pad> <pad> <pad> vain to to men are created equal . now we are engaged in a great civil war , testing whether that nation , or any nation so conceived and so dedicated , can long endure . we are met on a great battle - field of that war . we have come to dedicate a portion of that field , as a final resting place for those who here gave their lives that that nation might live . it is altogether fitting and proper that we should do this . but , in a larger sense , we can not
HyperParamters = {'vocab_size': 170, 'special_tokens': | |