Malaysian synthetic dataset Collection Synthetic datasets with Malaysian context, generated using LLMs. • 33 items • Updated 1 day ago • 1
Text-to-Speech dataset Collection Malay Text-to-Speech dataset, gathered from crawled audiobooks and online TTS. • 13 items • Updated 1 day ago • 1
Audio Multimodal dataset Collection Malaysian audio multimodal dataset for audio QA. • 3 items • Updated 1 day ago
Multi-Lingual Malaysian Embedding: Leveraging Large Language Models for Semantic Representations Paper • 2402.03053 • Published Feb 5 • 2
Large Malaysian Language Model Based on Mistral for Enhanced Local Language Understanding Paper • 2401.13565 • Published Jan 24 • 3