- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
  Paper • 2501.17161 • Published • 108
- S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
  Paper • 2502.12853 • Published • 28
- R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
  Paper • 2503.05592 • Published • 18
Shreyas S K
skshreyas714
AI & ML interests
NLP, NLU, NLI
Recent Activity
Updated a collection about 8 hours ago: Read-up research papers
Updated a dataset 13 days ago: skshreyas714/custom_guardrails_dataset
Updated a collection 17 days ago: Read-up research papers