Ali Bidaran
alibidaran
AI & ML interests
LLMs, Computer Vision, Generative AI, NLP, Machine/Deep Learning, Reinforcement Learning
Recent Activity
liked a dataset about 14 hours ago
angie-chen55/python-github-code
liked a dataset about 14 hours ago
dipesh/python-code-ds-mini
reacted to sergiopaniego's post 14 days ago
Just included example scripts for aligning models using GSPO (including a VLM example)
GSPO is the latest RL alignment algo by @Alibaba_Qwen and it's already supported in the latest TRL v0.20 release.
Super-easy-to-get-started example scripts below, GO run them!
Script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo.py
VLM script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo_vlm.py
More TRL examples: https://huggingface.co/docs/trl/main/en/example_overview
GSPO paper: https://huggingface.co/papers/2507.18071
Organizations
None yet