timpal0l committed · Commit b1ddc90 · verified · 1 Parent(s): e618bed

Update README.md

README.md CHANGED
@@ -58,15 +58,10 @@ Ikväll blir det grillat och det ser jag fram emot!"
 `AI-Sweden-Models/Llama-3-8B` is a continuation of the pretraining process from `meta-llama/Meta-Llama-3-8B`.
 It was trained on a subset from [The Nordic Pile](https://arxiv.org/abs/2303.17183) containing Swedish, Norwegian and Danish.
 
-The training dataset consists of 227 105 079 296 tokens.
+The training dataset consists of 227 105 079 296 tokens. The model was trained on the Rattler supercomputer at the Dell Technologies Edge Innovation Center in Austin, Texas. Training ran for 30 days on 23 nodes, each with 4x Nvidia A100 GPUs, for a total of 92 GPUs.
 
 ![](https://huggingface.co/AI-Sweden-Models/Llama-3-8B/resolve/main/13333333.jpg?download=true)
 
-
-## Benchmarks
-
-Coming soon.
-
 ## Checkpoints
 * 15/6/2024 (18833) => 1 epoch
 * 11/6/2024 (16000)
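The added hardware figures imply a rough per-GPU throughput. A back-of-the-envelope sketch (the node, GPU, day, and token counts are taken from the diff above; the derived throughput figure is my own calculation, not stated in the source):

```python
# Back-of-the-envelope throughput from the figures in the updated README:
# 23 nodes x 4 Nvidia A100 GPUs, 30 days, 227 105 079 296 training tokens.
nodes = 23
gpus_per_node = 4
total_gpus = nodes * gpus_per_node    # 92 GPUs, matching the README
total_tokens = 227_105_079_296
gpu_days = total_gpus * 30            # 2 760 GPU-days in total
tokens_per_gpu_day = total_tokens / gpu_days

print(f"{total_gpus} GPUs, ~{tokens_per_gpu_day / 1e6:.0f}M tokens per GPU per day")
```

This works out to roughly 82 million tokens per GPU per day, a plausible order of magnitude for continued pretraining of an 8B model on A100s.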