Facilitating large language model Russian adaptation with Learned Embedding Propagation Paper • 2412.21140 • Published 14 days ago • 14
Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis Paper • 2412.01819 • Published Dec 2, 2024 • 34
Multi-Granularity Prediction for Scene Text Recognition Paper • 2209.03592 • Published Sep 8, 2022 • 2
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated Nov 28, 2024 • 260
Constraint Back-translation Improves Complex Instruction Following of Large Language Models Paper • 2410.24175 • Published Oct 31, 2024 • 17
Language Models can Self-Lengthen to Generate Long Texts Paper • 2410.23933 • Published Oct 31, 2024 • 17
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Nov 28, 2024 • 459
Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources Paper • 2409.08239 • Published Sep 12, 2024 • 17
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation Paper • 2409.06820 • Published Sep 10, 2024 • 64
Strategic Chain-of-Thought: Guiding Accurate Reasoning in LLMs through Strategy Elicitation Paper • 2409.03271 • Published Sep 5, 2024 • 2
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients Paper • 2407.08296 • Published Jul 11, 2024 • 31
Vikhr: The Family of Open-Source Instruction-Tuned Large Language Models for Russian Paper • 2405.13929 • Published May 22, 2024 • 54
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding Paper • 2405.08748 • Published May 14, 2024 • 19
SUTRA: Scalable Multilingual Language Model Architecture Paper • 2405.06694 • Published May 7, 2024 • 37