Update README.md
README.md
CHANGED
@@ -39,7 +39,7 @@ tags:
 - Created by [David Xue](https://www.linkedin.com/in/david-xue-uva/) from [Astronomer](https://astronomer.io)
 
 ## Description
-This is the exact same model ([meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)) with the weights for the input and output embeddings from lm head and embedding matrix adjusted for certain tokens that were untrained which caused widespread issues for people attempting to fine-tune this base model with either adding their own tokens or using existing special tokens.
+This is the exact same model ([meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)) with the weights for the input and output embeddings from lm head and embedding matrix adjusted using the mean of the trained tokens for certain tokens that were untrained, which caused widespread issues for people attempting to fine-tune this base model with either adding their own tokens or using existing special tokens.
 
 ## Why We Made This Model
 
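
The added wording says the untrained token rows were adjusted "using the mean of the trained tokens". A minimal sketch of that idea on a toy matrix, in plain Python, might look like the following. The function name `mean_fill_untrained_rows`, the toy values, and the way untrained token IDs are passed in are illustrative assumptions; the README does not say how the untrained tokens were identified, and this is not the author's actual script.

```python
def mean_fill_untrained_rows(embeddings, untrained_ids):
    """Return a copy of `embeddings` in which every row listed in
    `untrained_ids` is replaced by the element-wise mean of the other rows.

    `embeddings` is a list of equal-length lists of floats (one row per token).
    `untrained_ids` is assumed to be known in advance (hypothetical input).
    """
    untrained = set(untrained_ids)
    trained_rows = [row for i, row in enumerate(embeddings) if i not in untrained]
    if not trained_rows:
        raise ValueError("no trained rows to average")

    dim = len(trained_rows[0])
    # Element-wise mean over the trained rows only.
    mean_row = [
        sum(row[d] for row in trained_rows) / len(trained_rows)
        for d in range(dim)
    ]

    # Copy every row so the input matrix is left unmodified.
    return [
        list(mean_row) if i in untrained else list(row)
        for i, row in enumerate(embeddings)
    ]


# Toy example: token 2 stands in for an untrained token with an all-zero row.
toy = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [5.0, 6.0]]
fixed = mean_fill_untrained_rows(toy, untrained_ids=[2])
print(fixed[2])
```

Per the description, the same kind of replacement would apply to both the input embedding matrix and the lm head; with a real model this would be done on the weight tensors rather than on Python lists.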