|
--- |
|
base_model: unsloth/Qwen3-0.6B |
|
library_name: peft |
|
license: mit |
|
datasets: |
|
- unsloth/OpenMathReasoning-mini |
|
- mlabonne/FineTome-100k |
|
language: |
|
- en |
|
pipeline_tag: text-generation
|
tags: |
|
- Math |
|
--- |
|
|
|
# Model Card for Qwen3-0.6B-OpenMathReason |
|
|
|
### Model Description |
|
|
|
This model is a fine-tuned version of unsloth/Qwen3-0.6B, trained with the Unsloth library and LoRA for parameter-efficient fine-tuning; a sketch of a representative LoRA setup follows the dataset list below.

The model was trained on two datasets:

- unsloth/OpenMathReasoning-mini — to enhance mathematical reasoning skills.
- mlabonne/FineTome-100k — to improve general conversational abilities.
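
The exact LoRA configuration is not recorded in this card. The following is a minimal sketch of a typical Unsloth LoRA setup for this base model; the sequence length, rank, alpha, dropout, and target modules are illustrative assumptions, not the values used for this checkpoint.

```python
from unsloth import FastLanguageModel

# Load the base model through Unsloth; max_seq_length is an assumed value.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-0.6B",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters. r, lora_alpha, lora_dropout, and target_modules
# are illustrative defaults, not the values used for this checkpoint.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```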
|
|
|
#### Model Details |
|
|
|
- **Developed by:** Rustam Shiriyev |
|
- **Language(s) (NLP):** English |
|
- **License:** MIT |
|
- **Finetuned from model:** unsloth/Qwen3-0.6B |
|
|
|
|
|
## Uses |
|
|
|
### Direct Use |
|
|
|
This model can be used as a lightweight assistant for solving basic to intermediate math problems (OpenMathReasoning-style tasks).
|
|
|
### Downstream Use |
|
|
|
- Can be integrated into educational chatbots for STEM learning; a merge-and-deploy sketch follows below.
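
For such integrations it is often easier to ship a single checkpoint. Below is a minimal sketch that folds the adapter into the base weights using PEFT's merge_and_unload; the output directory name is a placeholder.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, attach the adapter, then fold the LoRA weights in.
base = AutoModelForCausalLM.from_pretrained("unsloth/Qwen3-0.6B")
model = PeftModel.from_pretrained(base, "Rustamshry/Qwen3-0.6B-OpenMathReason")
merged = model.merge_and_unload()

# Save a plain standalone checkpoint; the directory name is a placeholder.
merged.save_pretrained("qwen3-0.6b-openmathreason-merged")
AutoTokenizer.from_pretrained("unsloth/Qwen3-0.6B").save_pretrained(
    "qwen3-0.6b-openmathreason-merged"
)
```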
|
|
|
### Out-of-Scope Use |
|
|
|
- Not suitable for high-stakes decision-making. |
|
|
|
## Bias, Risks, and Limitations |
|
|
|
- Mathematical reasoning is limited to the scope of the OpenMathReasoning-mini dataset.
|
- Conversational quality may degrade with complex or multi-turn inputs. |
|
|
|
|
|
## How to Get Started with the Model |
|
|
|
```python
from huggingface_hub import login
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer
from peft import PeftModel

login(token="")  # paste your Hugging Face access token here

# Load the base model and tokenizer, then attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen3-0.6B")
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen3-0.6B",
    device_map={"": 0},  # place the whole model on GPU 0
)
model = PeftModel.from_pretrained(base_model, "Rustamshry/Qwen3-0.6B-OpenMathReason")

question = "Solve (x + 2)^2 = 0"
messages = [
    {"role": "user", "content": question},
]

# Build the prompt with Qwen3's chat template; enable_thinking=True lets
# the model emit its reasoning before the final answer.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

# Generate and stream tokens to stdout as they are produced.
_ = model.generate(
    **tokenizer(text, return_tensors="pt").to(model.device),
    max_new_tokens=2048,
    do_sample=True,  # required for temperature/top_p/top_k to take effect
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    streamer=TextStreamer(tokenizer, skip_prompt=True),
)
```
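
The sampling settings above (temperature 0.6, top_p 0.95, top_k 20) match the values Qwen recommends for Qwen3 in thinking mode. If you disable thinking with enable_thinking=False, Qwen's guidance suggests different sampling values, so adjust them accordingly.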
|
## Training Details |
|
|
|
### Training Data |
|
|
|
- unsloth/OpenMathReasoning-mini: 10k+ instruction-following examples focused on mathematical reasoning.
- mlabonne/FineTome-100k: 100k examples of diverse, high-quality chat data (combined with the math set as sketched below).
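
The card does not say how the two datasets were mixed. Below is a minimal sketch using the datasets library, assuming the published schemas (a cot split with problem/generated_solution columns for the math set, and ShareGPT-style conversations for FineTome); verify the split and column names against the dataset pages.

```python
from datasets import load_dataset, concatenate_datasets

# Split and column names below are assumptions based on the published
# dataset schemas; verify them on the dataset pages before use.
math_ds = load_dataset("unsloth/OpenMathReasoning-mini", split="cot")
chat_ds = load_dataset("mlabonne/FineTome-100k", split="train")

def math_to_messages(ex):
    # Map a problem/solution pair to chat-style messages.
    return {"messages": [
        {"role": "user", "content": ex["problem"]},
        {"role": "assistant", "content": ex["generated_solution"]},
    ]}

def sharegpt_to_messages(ex):
    # FineTome stores ShareGPT-style turns: {"from": ..., "value": ...}.
    role_map = {"system": "system", "human": "user", "gpt": "assistant"}
    return {"messages": [
        {"role": role_map[turn["from"]], "content": turn["value"]}
        for turn in ex["conversations"]
    ]}

math_ds = math_ds.map(math_to_messages, remove_columns=math_ds.column_names)
chat_ds = chat_ds.map(sharegpt_to_messages, remove_columns=chat_ds.column_names)
combined = concatenate_datasets([math_ds, chat_ds]).shuffle(seed=42)
```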
|
|
|
### Training Procedure |
|
|
|
The following hyperparameters were used (a trainer configuration sketch follows the list):

- batch size: 8
- gradient accumulation steps: 2
- optimizer: adamw_torch
- learning rate: 2e-5
- warmup steps: 100
- fp16: True
- dataloader_num_workers: 16
- num_train_epochs: 1
- weight_decay: 0.01
- lr_scheduler_type: linear
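
A minimal sketch of how these hyperparameters map onto a recent version of TRL's SFTConfig/SFTTrainer; this reconstructs the configuration from the list above and is not the original training script. The output path is a placeholder, and `model`, `tokenizer`, and `combined` refer to the earlier sketches.

```python
from trl import SFTConfig, SFTTrainer

# Mirror the hyperparameters listed above; output_dir is a placeholder.
config = SFTConfig(
    output_dir="qwen3-0.6b-openmathreason",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    optim="adamw_torch",
    learning_rate=2e-5,
    warmup_steps=100,
    fp16=True,
    dataloader_num_workers=16,
    num_train_epochs=1,
    weight_decay=0.01,
    lr_scheduler_type="linear",
)

trainer = SFTTrainer(
    model=model,             # LoRA-wrapped model from the earlier sketch
    train_dataset=combined,  # merged dataset from the Training Data sketch
    processing_class=tokenizer,
    args=config,
)
trainer.train()
```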
|
|
|
|
|
|
|
### Results |
|
|
|
- Final training loss: ≈ 0.56
|
|
|
### Framework versions |
|
|
|
- PEFT 0.14.0 |