Democratizing access to LLMs for the open-source community.
Let's advance AI, together.
Introduction
We are thrilled to announce the open-sourcing of our boomer-634m model, an important milestone in our ongoing AI research. This model, with 634 million parameters, was meticulously pre-trained from scratch on a custom synthetic dataset comprising 12 billion tokens.
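To put the two headline numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes the parameter count is exactly 634 million and the dataset exactly 12 billion tokens (both are rounded figures), and the bytes-per-weight values for each dtype are general assumptions, not something stated for this checkpoint. The memory estimate covers weights only, not activations or the KV cache.

```python
# Rounded figures quoted in the introduction.
PARAMETERS = 634_000_000
TRAINING_TOKENS = 12_000_000_000

# Assumed storage cost per weight for common dtypes (illustrative only).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Approximate size of the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# How many training tokens the model saw per parameter.
tokens_per_parameter = TRAINING_TOKENS / PARAMETERS

print(f"tokens per parameter: {tokens_per_parameter:.1f}")
for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gb(PARAMETERS, dtype):.2f} GB of weights")
```

Under these assumptions the model saw roughly 19 tokens per parameter, and the weights occupy about 1.3 GB in half precision.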
Run the model
Here is a quick guide to get you started with boomer-634m:
Please note that, at the moment, trust_remote_code=True is required for running the model.
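from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code=True lets transformers run the custom model code shipped with the checkpoint.
model = AutoModelForCausalLM.from_pretrained("budecosystem/boomer-634m", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("budecosystem/boomer-634m")

# Tokenize the prompt and move the tensors to the model's device.
input_ids = tokenizer("Explain why the sky is blue.", return_tensors="pt").to(model.device)["input_ids"]

# Generate up to 216 new tokens, then decode the ids back to text.
outputs = model.generate(input_ids, max_new_tokens=216)
print(tokenizer.batch_decode(outputs))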
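For causal language models, generate returns the prompt ids followed by the newly generated ids, so the printed text repeats the question. A minimal sketch of trimming the prompt is shown below. The helper name strip_prompt is hypothetical, and the token ids are toy values standing in for real tokenizer output.

```python
from typing import List, Sequence


def strip_prompt(sequences: Sequence[Sequence[int]], prompt_length: int) -> List[List[int]]:
    """Drop the first `prompt_length` ids from every generated sequence."""
    return [list(seq[prompt_length:]) for seq in sequences]


# Toy ids (hypothetical values): three prompt tokens followed by three new tokens.
prompt_ids = [101, 7, 42]
generated = [prompt_ids + [9, 9, 3]]

new_tokens = strip_prompt(generated, len(prompt_ids))
print(new_tokens)  # [[9, 9, 3]]
```

With the quick-start code, this would be called as strip_prompt(outputs.tolist(), input_ids.shape[1]), and the result passed to tokenizer.batch_decode, optionally with skip_special_tokens=True to hide end-of-sequence markers.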
Evaluations
The boomer-634m model was evaluated on the following benchmarks:
Model Name | MMLU | ARC | HellaSwag | GSM8K | WinoGrande | MathQA | LogiQA |
---|---|---|---|---|---|---|---|
boomer-634m | 25.91 | 29.86 | 39.24 | 1.67 | 50.67 | 23.55 | 28.42 |
Final thought on Boomer!
Embarking on the journey with boomer-634m is just the beginning. We are committed to developing more advanced, efficient, and accessible AI models. Join us in this exciting adventure to shape the future of AI.
Acknowledgements
Our heartfelt thanks go to the open-source community and the trailblazers in AI research whose work has paved the way for innovations like boomer-634m.