---
license: cc-by-nc-nd-3.0
---

# VyLinh-Lite: Vietnamese 3B Reasoning Language Model

## Model Details

- **Language(s)**: Vietnamese
- **Base Model**: Qwen2.5-3B
- **Model Size**: 3 billion parameters

## Intended Use

- **Primary intended uses**: Vietnamese language understanding, reasoning, and generation
- **Primary intended users**: Researchers, developers, and practitioners working with Vietnamese-language AI
- **Out-of-scope use cases**: Production deployments without additional safety measures

## Training Details

### Training Data

The model was trained through a multi-stage pipeline of distillation and adaptation:

1. Initial knowledge distillation from Llama 3.1 405B
2. Architecture adaptation using mergekit-tokensurgeon
3. Secondary distillation to the Qwen architecture
4. Parallel distillation from Qwen2-72B
5. Final fusion and fine-tuning on the EvolKit dataset

### Training Procedure

#### Distillation Process

1. **Logit Distillation**
   - Source: Llama 3.1 405B
   - Method: Offline distillation
   - Storage: Top-K logit preservation
2. **Cross-Architecture Adaptation**
   - Tool: mergekit-tokensurgeon
   - Process: Vocabulary alignment with Llama 3.1 405B
3. **Architecture Transformation**
   - Target: 3B parameter configuration
   - Method: Progressive knowledge transfer

#### Fine-tuning

- **Final Stage**: Fine-tuning on the EvolKit dataset
- **Optimization**: Focus on coherence and reasoning capabilities
- **Vocabulary**: Restoration of the native Qwen vocabulary

## Performance and Limitations

### Benchmarks

Benchmark results will be added as evaluations complete.

### Limitations

- Model size constraints may impact certain complex reasoning tasks
- Performance may vary on domain-specific Vietnamese content
- Limited context window compared to larger models

## Ethical Considerations

- **Data Bias**: May reflect biases present in the training data
- **Environmental Impact**: Reduced relative to larger models due to efficient distillation
- **Societal Impact**: Potential influence on the Vietnamese language-technology landscape

## Technical Specifications

- **Parameter Count**: 3 billion
- **Context Window**: 32K tokens
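The offline logit distillation with top-K storage described under Training Procedure can be sketched in a few lines. This is a minimal illustration, not the project's actual training code: the function names are invented for clarity, and it assumes the common approximation of renormalizing the teacher's stored top-K probabilities and computing KL divergence against the student restricted to the same vocabulary indices.

```python
import math

def top_k_logits(logits, k):
    # Storage step of offline distillation: keep only the k largest
    # teacher logits as (vocab_index, logit) pairs instead of the full
    # vocabulary distribution, so the teacher's outputs fit on disk.
    return sorted(enumerate(logits), key=lambda p: p[1], reverse=True)[:k]

def softmax(values):
    # Numerically stable softmax over a small list of logits.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def topk_kl_loss(teacher_topk, student_logits):
    # KL(teacher || student) restricted to the stored top-K indices.
    # Both distributions are renormalized over those indices -- a common
    # approximation once the teacher's tail mass has been discarded.
    indices = [i for i, _ in teacher_topk]
    t = softmax([v for _, v in teacher_topk])
    s = softmax([student_logits[i] for i in indices])
    return sum(ti * math.log(ti / si) for ti, si in zip(t, s))

# Toy example: a 4-token vocabulary, keeping the top 2 teacher logits.
teacher = [2.0, 1.0, 0.5, -1.0]
stored = top_k_logits(teacher, 2)
# A student matching the teacher on those indices gives (near) zero loss;
# a mismatched student gives a positive loss.
print(topk_kl_loss(stored, [2.0, 1.0, 0.5, -1.0]))
print(topk_kl_loss(stored, [1.0, 2.0, 0.5, -1.0]))
```

In practice the stored pairs would be written per token position during a single teacher forward pass, then streamed back during student training, which is what makes the 405B-teacher stage feasible without keeping the teacher in memory.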