vilm
/

VyLinh-Lite-preview

Model card Files Files and versions Community

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Log in or Sign Up to review the conditions and access this model content.

VyLinh-Lite: Vietnamese 3B Reasoning Language Model

Model Details

Language(s): Vietnamese
Base Model: Qwen2.5-3B
Model Size: 3 billion parameters

Intended Use

Primary intended uses: Vietnamese language understanding, reasoning, and generation
Primary intended users: Researchers, developers, and practitioners working with Vietnamese language AI
Out-of-scope use cases: Production deployments without additional safety measures

Training Details

Training Data

The model underwent a sophisticated training process involving multiple stages of distillation and adaptation:

Initial knowledge distillation from Llama 3.1 405B
Architecture adaptation using mergekit-tokensurgeon
Secondary distillation to Qwen architecture
Parallel distillation from Qwen2-72B
Final fusion and fine-tuning using EvolKit dataset

Training Procedure

Distillation Process

Logit Distillation
- Source: Llama 3.1 405B
- Method: Offline distillation
- Storage: Top-K logits preservation
Cross-Architecture Adaptation
- Tool: mergekit-tokensurgeon
- Process: Vocabulary alignment with Llama 3.1 405B
Architecture Transformation
- Target: 3B parameter configuration
- Method: Progressive knowledge transfer

Fine-tuning

Final Stage: EvolKit dataset utilization
Optimization: Focus on coherence and reasoning capabilities
Vocabulary: Qwen-native vocabulary restoration

Performance and Limitations

Benchmarks

Will be updated throughout the day

Limitations

Model size constraints may impact certain complex reasoning tasks
Performance may vary on domain-specific Vietnamese content
Limited context window compared to larger models

Ethical Considerations

Data Bias: May reflect biases present in training data
Environmental Impact: Reduced compared to larger models due to efficient distillation
Societal Impact: Potential influence on Vietnamese language technology landscape

Technical Specifications

Parameter Count: 3 billion
Context Window: 32K

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

This model is not currently available via any of the supported Inference Providers.

The model cannot be deployed to the HF Inference API: The model has no library tag.