- Aligning Instruction Tuning with Pre-training
  Paper • 2501.09368 • Published
- Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
  Paper • 2403.14608 • Published
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 55
ROHITH VENKATA REDDY
knight7561
AI & ML interests
Deep learning, Autonomous Driving
Recent Activity
liked a Space 3 days ago
nanotron/ultrascale-playbook
new activity 3 days ago
nanotron/ultrascale-playbook:Questions?
Organizations
Collections
2
Papers dump of LLM Reasoning domain
- Internal Consistency and Self-Feedback in Large Language Models: A Survey
  Paper • 2407.14507 • Published • 47
- Large Language Models are Zero-Shot Reasoners
  Paper • 2205.11916 • Published • 1
- Let's Verify Step by Step
  Paper • 2305.20050 • Published • 10
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
  Paper • 2201.11903 • Published • 11
spaces
2
models
5
knight7561/SmolLM2_python_coder-FT-ORPO
Text Generation • Updated • 8
knight7561/SmolLM2-FT-DPO-python-code
Text Generation • Updated • 10
knight7561/SmolLM2_python_coder
Text Generation • Updated • 24
knight7561/SmolLM2-eli5_precomputed_top_slice
Text Generation • Updated • 9
knight7561/SmolLM2-FT-MyDataset
Text Generation • Updated • 9
datasets
None public yet