Kristaller486

kristaller486

AI & ML interests

NLP, Machine Translation

Recent Activity

liked a model 19 days ago

deepseek-ai/DeepSeek-V3

kristaller486's activity

upvoted a paper 13 days ago

Facilitating large language model Russian adaptation with Learned Embedding Propagation

Paper • 2412.21140 • Published 14 days ago • 14

upvoted a collection 19 days ago

DeepSeek-V3

3 items • Updated 8 days ago • 112

upvoted a collection about 1 month ago

FineWeb2 Collaborative Annotation Sprint

5 items • Updated 20 days ago • 6

upvoted a paper about 1 month ago

Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis

Paper • 2412.01819 • Published Dec 2, 2024 • 34

upvoted a paper about 2 months ago

Multi-Granularity Prediction for Scene Text Recognition

Paper • 2209.03592 • Published Sep 8, 2022 • 2

upvoted a collection 2 months ago

Qwen2.5-Coder

Code-specific model series based on Qwen2.5 • 40 items • Updated Nov 28, 2024 • 260

upvoted 2 papers 2 months ago

Constraint Back-translation Improves Complex Instruction Following of Large Language Models

Paper • 2410.24175 • Published Oct 31, 2024 • 17

Language Models can Self-Lengthen to Generate Long Texts

Paper • 2410.23933 • Published Oct 31, 2024 • 17

upvoted a collection 3 months ago

DocLayout-YOLO

Dataset and model for DocLayout-YOLO • 9 items • Updated Oct 22, 2024 • 12

upvoted a collection 4 months ago

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Nov 28, 2024 • 459

upvoted 4 papers 4 months ago

GRIN: GRadient-INformed MoE

Paper • 2409.12136 • Published Sep 18, 2024 • 16

Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources

Paper • 2409.08239 • Published Sep 12, 2024 • 17

PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation

Paper • 2409.06820 • Published Sep 10, 2024 • 64

Strategic Chain-of-Thought: Guiding Accurate Reasoning in LLMs through Strategy Elicitation

Paper • 2409.03271 • Published Sep 5, 2024 • 2

upvoted a paper 6 months ago

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients

Paper • 2407.08296 • Published Jul 11, 2024 • 31

upvoted a paper 7 months ago

Depth Anything V2

Paper • 2406.09414 • Published Jun 13, 2024 • 95

upvoted 4 papers 8 months ago

Vikhr: The Family of Open-Source Instruction-Tuned Large Language Models for Russian

Paper • 2405.13929 • Published May 22, 2024 • 54

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Paper • 2405.08748 • Published May 14, 2024 • 19

SUTRA: Scalable Multilingual Language Model Architecture

Paper • 2405.06694 • Published May 7, 2024 • 37

WildChat: 1M ChatGPT Interaction Logs in the Wild

Paper • 2405.01470 • Published May 2, 2024 • 62