
ChuGyouk PRO

ChuGyouk

https://gyoukchu.vercel.app/

GyoukChu

AI & ML interests

Interested in how to escape being GPU-poor


Organizations

None yet

ChuGyouk's activity

upvoted a paper 2 days ago

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Paper • 2502.14768 • Published 8 days ago • 42

New activity in junnei/ko-limo 2 days ago

Critical: Translation Quality Evaluation Required

#2 opened 2 days ago by ChuGyouk

liked a model 3 days ago

imsanjoykb/deepSQL-R1-distill-8B

Text Generation • Updated 25 days ago • 293 • 3

liked 2 models 7 days ago

deepseek-ai/deepseek-vl2

Image-Text-to-Text • Updated Dec 18, 2024 • 19.1k • 289

perplexity-ai/r1-1776

Text Generation • Updated 1 day ago • 31.9k • 1.88k

liked a dataset 7 days ago

facebook/natural_reasoning

Viewer • Updated 7 days ago • 1.15M • 3.78k • 248

liked 2 models 7 days ago

Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • Updated 13 days ago • 2.01M • 579

Qwen/Qwen2.5-VL-72B-Instruct

Image-Text-to-Text • Updated 13 days ago • 264k • 336

upvoted a paper 7 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 9 days ago • 148

liked a dataset 10 days ago

Anthropic/persuasion

Viewer • Updated Apr 9, 2024 • 3.94k • 581 • 183

liked a dataset 11 days ago

junnei/ko-limo

Viewer • Updated 16 days ago • 817 • 137 • 12

liked a model 11 days ago

microsoft/OmniParser-v2.0

Image-Text-to-Text • Updated 10 days ago • 6.73k • 1.03k

liked a dataset 17 days ago

Magpie-Align/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B

Viewer • Updated Jan 27 • 250k • 6.17k • 85

reacted to s-emanuilov's post with 🔥 17 days ago


Tutorial 💥 Training a non-English reasoning model with GRPO and Unsloth

I wanted to share my experiment with training reasoning models in languages other than English/Chinese.

Using Llama 3.1 8B as base, GRPO trainer from trl, and Unsloth optimizations, I got a working prototype in Bulgarian after ~5 hours on an L40S GPU. The approach should work for any language where the base model has some pre-training coverage.

Full code and tutorial here: https://unfoldai.com/reasoning-in-a-non-english-language/

The model itself: s-emanuilov/LLMBG-Llama-3.1-8B-BG-Reasoning-v0.1

I hope this helps anyone looking to build reasoning models in their language.
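For readers unfamiliar with GRPO, the core idea behind the setup described in the post can be sketched in plain Python: several completions are sampled for the same prompt, each is scored by a rule-based reward, and the rewards are normalized within that group to obtain advantages. This is only an illustrative sketch, not the code from the linked tutorial. The `<reasoning>`/`<answer>` tag format, the 0.8 Cyrillic threshold, the reward weights, and all function names are assumptions made for this example; the use of population standard deviation is also a simplification.

```python
import re
import statistics

# Hypothetical output format: the post does not say how reasoning is delimited.
FORMAT_RE = re.compile(
    r"^<reasoning>(.+?)</reasoning>\s*<answer>(.+?)</answer>\s*$", re.DOTALL
)
CYRILLIC_RE = re.compile(r"[\u0400-\u04FF]")


def cyrillic_ratio(text):
    """Fraction of alphabetic characters that fall in the Cyrillic block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(1 for ch in letters if CYRILLIC_RE.match(ch)) / len(letters)


def reward(completion, expected_answer):
    """Rule-based reward: format, target-language reasoning, correct answer."""
    match = FORMAT_RE.match(completion.strip())
    if match is None:
        return 0.0  # no reward at all if the format is not followed
    reasoning, answer = match.group(1), match.group(2).strip()
    score = 0.5  # assumed weight for following the format
    if cyrillic_ratio(reasoning) >= 0.8:  # assumed threshold
        score += 0.5  # assumed weight for reasoning in the target language
    if answer == expected_answer:
        score += 1.0  # assumed weight for a correct final answer
    return score


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std, a simplification
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    group = [
        "<reasoning>Две плюс две е четири.</reasoning><answer>4</answer>",
        "<reasoning>Two plus two is four.</reasoning><answer>4</answer>",
        "4",
    ]
    rewards = [reward(c, "4") for c in group]
    print(rewards)
    print(group_relative_advantages(rewards))
```

Completions that reason in the target language and answer correctly end up with the highest advantage in their group, which is the signal that nudges the model toward reasoning in that language.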