Shifting Long-Context LLMs Research from Input to Output • arXiv:2503.04723 • Published 13 days ago • 19 upvotes
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning • arXiv:2503.07572 • Published 9 days ago • 36 upvotes
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL • arXiv:2503.07536 • Published 9 days ago • 78 upvotes
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts • arXiv:2503.05447 • Published 12 days ago • 7 upvotes
Forgetting Transformer: Softmax Attention with a Forget Gate • arXiv:2503.02130 • Published 15 days ago • 27 upvotes
HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization • arXiv:2503.04598 • Published 13 days ago • 17 upvotes
PLDR-LLMs Learn A Generalizable Tensor Operator That Can Replace Its Own Deep Neural Net At Inference • arXiv:2502.13502 • Published 28 days ago • 3 upvotes
Liger: Linearizing Large Language Models to Gated Recurrent Structures • arXiv:2503.01496 • Published 16 days ago • 15 upvotes
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs • arXiv:2503.01307 • Published 16 days ago • 31 upvotes
Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models • arXiv:2502.15499 • Published 26 days ago • 13 upvotes
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam • arXiv:2502.17055 • Published 23 days ago • 16 upvotes
AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO • arXiv:2502.14669 • Published 27 days ago • 12 upvotes
Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities? • arXiv:2502.12215 • Published about 1 month ago • 16 upvotes
You Do Not Fully Utilize Transformer's Representation Capacity • arXiv:2502.09245 • Published Feb 13, 2025 • 34 upvotes
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity • arXiv:2502.13063 • Published 29 days ago • 67 upvotes