kalo-team/llama3-4x8b-pythonT2_step_final
Text Generation
Transformers
Safetensors
English
mixtral
code
conversational
text-generation-inference
Inference Endpoints
arxiv: 2303.01610
License: llama3
kalomaze committed on May 20, 2024
Commit f4ccb00 · verified · 1 Parent(s): 15be534
Create README.md
README.md ADDED +2 -0
@@ -0,0 +1,2 @@
+lr = 2e-6, ~2.5 mil tokens of Python instruct data, all around ~7k tokens ish for each sample (300 total samples).
+1 epoch distillation of 70b logprobs, topk=200
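
The README does not say how the top-k teacher logprobs enter the loss. A minimal sketch of one common choice, assuming the kept teacher logprobs are renormalized over the top-k tokens and the student is trained with cross-entropy against that distribution. The function names and the toy vocabulary are hypothetical and not taken from the repo.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def topk_distill_loss(student_logits, teacher_topk):
    """Cross-entropy of the student against a top-k teacher distribution.

    student_logits: raw student logits over the full vocabulary, one position.
    teacher_topk:   {token_id: teacher_logprob} for the kept top-k tokens
                    (k=200 in the README; tiny here for illustration).
    Assumption: teacher mass outside the top-k is dropped and the kept
    probabilities are renormalized to sum to 1.
    """
    student_logprobs = log_softmax(student_logits)
    teacher_probs = {t: math.exp(lp) for t, lp in teacher_topk.items()}
    z = sum(teacher_probs.values())
    return -sum((p / z) * student_logprobs[t] for t, p in teacher_probs.items())

# Toy example: 4-token vocabulary, teacher keeps its top 2 tokens.
student_logits = [0.0, 0.0, 0.0, 0.0]
teacher_topk = {0: math.log(0.45), 2: math.log(0.45)}
loss = topk_distill_loss(student_logits, teacher_topk)
```

In training this would be averaged over every token position in a sample. With only the top 200 logprobs stored per position, the teacher targets stay small enough to precompute once from the 70b model and reuse.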