Mariusz Kurman PRO (mkurman)

AI & ML interests

AI Tech Lead | MD

Recent Activity

Organizations

MedIT Solutions · BigScience Biomedical Datasets · SOWA Project

mkurman's activity

reacted to nicolay-r's post with 🔥 2 days ago
📢 The distilled 8B version of DeepSeek-R1 based on LLaMA-3.1 is now available, alongside the Qwen-based one

📙 Notebook for using it in reasoning over a series of data 🧠:
https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_deep_seek_7b_distill_llama3.ipynb

Loading using the pipeline API of the transformers library:
https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/transformers_llama.py
🟡 GPU usage: 12.3 GB (FP16/FP32 mode), which is suitable for a T4 (1.5 GB less than the Qwen-distilled version)
🌍 Performance on a T4 instance: ~0.19 tokens/sec (FP32 mode) and ~0.22-0.30 tokens/sec (FP16 mode). Should it be that slow? 🤔
Model name: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
โญ Framework: https://github.com/nicolay-r/bulk-chain
๐ŸŒŒ Notebooks and models hub: https://github.com/nicolay-r/nlp-thirdgate
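
For reference, a minimal sketch of loading the model through the transformers pipeline API (the prompt and generation settings here are illustrative):

```python
import torch
from transformers import pipeline

# Load the distilled 8B model; FP16 keeps GPU usage near the reported 12.3 GB.
pipe = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    torch_dtype=torch.float16,
    device_map="auto",
)

out = pipe("What is 17 * 23? Think step by step.", max_new_tokens=256)
print(out[0]["generated_text"])
```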
reacted to fuzzy-mittenz's post with 😎🤗👀🔥🚀❤️ 3 days ago
Not many seemed to notice, but what was probably meant to be a WIN for artists' rights at the US Copyright Office has solved some fundamental issues for the community.
In our recent article, I outline how companies like Suno, OpenAI, and Midjourney can no longer claim any right to copy the work you create with their platforms.
We also look at other ways this study and the new rules for AI will fundamentally affect creators who use it, and how companies' incentives to give them control over certain aspects might change because of this. It's broken down pretty well here: https://huggingface.co/blog/fuzzy-mittenz/copyright-in-ai
replied to Jaward's post 3 days ago

Yeah, the fun part is that I can use any QA dataset in GRPO just by instructing the model to follow simple rules: place your answer in \boxed{} or ** ** tags. Then I extract it with a regex, and it simply works.
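For illustration, a minimal sketch of such a rule-based reward with exact-match scoring (the function names here are made up):

```python
import re

def extract_answer(text: str) -> str | None:
    r"""Pull the final answer out of \boxed{...} or **...** markers."""
    match = re.search(r"\\boxed\{([^}]*)\}|\*\*([^*]+)\*\*", text)
    if match is None:
        return None
    return (match.group(1) or match.group(2)).strip()

def accuracy_reward(completion: str, gold: str) -> float:
    """1.0 if the extracted answer matches the reference, else 0.0."""
    answer = extract_answer(completion)
    return float(answer is not None and answer == gold.strip())
```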

reacted to Jaward's post with 🔥 3 days ago
The beauty of GRPO is that it doesn't care whether the rewards are rule-based or learned. The hack: let the data self-normalize. Trajectories in a batch compete against their mean, so there is no value model and no extra parameters; just clean, efficient RL that cuts memory usage by 50% while maintaining SOTA performance. By the way, it was introduced 9 months prior to R1: arxiv.org/pdf/2402.03300
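
A minimal sketch of that group-relative normalization, assuming a tensor of rewards for G completions of the same prompt:

```python
import torch

def group_advantages(rewards: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """GRPO-style advantages: each trajectory is scored against the
    mean (and std) of its own group, so no learned value model is needed."""
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    return (rewards - mean) / (std + eps)

# Rewards for G=4 completions of a single prompt.
print(group_advantages(torch.tensor([[1.0, 0.0, 0.0, 1.0]])))
```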
replied to their post 3 days ago

For blurred thoughts, you must set the labels equal to ignore_index (-100); that should be enough! For BT-SFT I used the normal cross-entropy loss, but a critique/reward model can also be a good idea! Check this paper: https://arxiv.org/abs/2501.17703
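
A minimal sketch of that masking step, assuming a boolean mask that marks the blurred thought tokens:

```python
import torch

IGNORE_INDEX = -100  # positions with this label are skipped by cross-entropy loss

def blur_labels(input_ids: torch.Tensor, blur_mask: torch.Tensor) -> torch.Tensor:
    """Copy input_ids into labels and hide the blurred thought tokens
    from the loss by setting them to ignore_index."""
    labels = input_ids.clone()
    labels[blur_mask] = IGNORE_INDEX
    return labels
```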

posted an update 5 days ago
Blurred-Thoughts Supervised Fine-Tuning (BT-SFT) 🤖

Can we teach a model to think completely on its own without reinforcement learning? Actually, yes.

We can do straightforward supervised fine-tuning using a relatively simple trick: blurring a part of CoT thoughts. But why is this effective?

We observed that various models differ in their thinking processes, and fine-tuning one model on another model's thoughts (CoT) can sometimes be inefficient, often resulting in the model simply memorizing reasoning rather than learning how to actually think.

I discovered that this process can still be efficient if we clearly indicate when the model should start and stop thinking, uncover only part of the CoT and the expected answer, and blur the remaining part of the CoT. This approach allows the model to learn only a portion of the thought process while still arriving at the expected answer.
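
A rough sketch of how such a blur span could be chosen, assuming the CoT region is already located by token indices (the blur ratio and the single contiguous span are illustrative choices, not the exact recipe):

```python
import random
import torch

def random_blur_mask(seq_len: int, think_start: int, think_end: int,
                     blur_ratio: float = 0.5) -> torch.Tensor:
    """Mark a random contiguous span inside the CoT region
    [think_start, think_end) as blurred; everything else stays visible."""
    mask = torch.zeros(seq_len, dtype=torch.bool)
    cot_len = think_end - think_start
    blur_len = int(cot_len * blur_ratio)
    start = think_start + random.randint(0, cot_len - blur_len)
    mask[start:start + blur_len] = True
    return mask
```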

To demonstrate this, you can watch my experimental BT-SFT of the meditsolutions/Llama-3.2-SUN-2.5B-chat model, which was fine-tuned on 151 million tokens from the Magpie-Align/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B dataset.

Enjoy! 🚀

PS. If you were curious enough to read this, leave me a comment. It's always nice to chat with open-minded and intelligent people.
reacted to Abhaykoul's post with 👀 5 days ago
🔥 THE WAIT IS OVER... HAI-SER IS HERE! 🔥

Yo fam, this ain't just another AI drop; this is the FUTURE of emotional intelligence! 🚀

Introducing HAI-SER, powered by Structured Emotional Reasoning (SER), the next-level AI that doesn't just understand your words: it feels you, analyzes your emotions, and helps you navigate life's toughest moments. 💡

💥 What makes HAI-SER a game-changer?
🔹 Emotional Vibe Check – Gets the mood, energy, and what's really going on 🎭
🔹 Mind-State Analysis – Breaks down your thoughts, beliefs, and patterns 🤯
🔹 Root Cause Deep-Dive – Unpacks the WHY behind your emotions 💡
🔹 Impact Check – Sees how it's affecting your life and mental health 💔
🔹 Safety Check – Prioritizes your well-being and crisis management 🚨
🔹 Healing Game Plan – Custom strategies to help you bounce back 💪
🔹 Growth Potential – Turns struggles into opportunities for self-improvement 📈
🔹 How to Approach – Teaches you and others how to communicate and heal 🤝
🔹 Personalized Response – Not just generic advice; real talk, tailored to YOU 💯

No more robotic AI responses. No more surface-level advice. HAI-SER gets deep, analyzing emotions with precision and giving real, actionable support.

This ain't just AI; this is your digital therapist, life coach, and hype squad all in one. Whether it's mental health, career struggles, relationships, or personal growth, HAI-SER has your back.

🚀 The future of emotionally intelligent AI is HERE.
Are you ready? 🔥💯

HelpingAI/HAI-SER
reacted to singhsidhukuldeep's post with 🚀 5 days ago
Groundbreaking Research Alert: Can Large Language Models Really Understand Personal Preferences?

A fascinating new study from researchers at University of Notre Dame, Xi'an Jiaotong University, and Université de Montréal introduces PERRECBENCH - a novel benchmark for evaluating how well Large Language Models (LLMs) understand user preferences in recommendation systems.

Key Technical Insights:
- The benchmark eliminates user rating bias and item quality factors by using relative ratings and grouped ranking approaches
- Implements three distinct ranking methods: pointwise rating prediction, pairwise comparison, and listwise ranking
- Evaluates 19 state-of-the-art LLMs including Claude-3.5, GPT-4, Llama-3, Mistral, and Qwen models
- Uses Kendall's tau correlation to measure ranking accuracy (a toy example follows this list)
- Incorporates BM25 retriever with configurable history items (k=4 by default)
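
As a toy illustration of the Kendall's tau metric (the numbers are made up, not from the paper):

```python
from scipy.stats import kendalltau

# Ground-truth preference order vs. an LLM's predicted ranking of five items.
true_rank = [1, 2, 3, 4, 5]
llm_rank = [2, 1, 3, 5, 4]

tau, p_value = kendalltau(true_rank, llm_rank)
print(f"Kendall's tau = {tau:.2f} (p = {p_value:.3f})")  # 1.0 = identical rankings
```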

Notable Findings:
- Current LLMs struggle with true personalization, achieving only moderate correlation scores
- Larger models don't always perform better - challenging conventional scaling laws
- Pairwise and listwise ranking methods outperform pointwise approaches
- Open-source models like Mistral-123B and Llama-3-405B compete well with proprietary models
- Weight merging strategy shows promise for improving personalization capabilities

The research reveals that while LLMs excel at many tasks, they still face significant challenges in understanding individual user preferences. This work opens new avenues for improving personalized recommendation systems and highlights the importance of developing better evaluation methods.

A must-read for anyone interested in LLMs, recommender systems, or personalization technology. The team has made their benchmark and code publicly available for further research.
reacted to nicolay-r's post with 👀 5 days ago
🚨 If you want to quickly apply various reasoning techniques 🧠 to your dataset, I am happy to save you time and introduce 🌌 nlp-thirdgate 🌌

https://github.com/nicolay-r/nlp-thirdgate

This is a hub of third-party providers, such as OpenAI, Replicate, OpenRouter, and Hugging Face 🤗 Transformers, to be used for various NLP tasks in a no-strings mode, so you decide which dependencies to install. I personally find this handy for:
📙 quick script deployment in notebooks like Google Colab;
📦 empowering existing apps with machine learning;

📷 The example below demonstrates how to quickly get started with reasoning over rows of CSV / JSONL data.

To get started, all you have to do is download one of the providers and pass it to the script, as shown in the image below.
🌟 Powered by bulk-chain: https://github.com/nicolay-r/bulk-chain
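
As a rough illustration of the idea only (this is not the bulk-chain API; the CSV column name and the prompt template are made up for this sketch):

```python
import csv
from transformers import pipeline

# Any instruction-tuned model works here; the distilled reasoning model is one option.
pipe = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    device_map="auto",
)

with open("data.csv", newline="") as f:
    for row in csv.DictReader(f):
        prompt = f"Think step by step, then answer: {row['question']}"
        out = pipe(prompt, max_new_tokens=128)
        print(out[0]["generated_text"])
```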
reacted to csabakecskemeti's post with 🔥 6 days ago
reacted to Bils's post with 🔥 6 days ago
🚀 We're excited to share major improvements to our Janus-Pro-7B Text-to-Image Generation Space!
🎨 What's New:
1. Critical Bug Fixes
2. Enhanced Features
3. UI Improvements
4. Performance Boost
Try It Now:
Bils/DeepseekJanusPro-Image
replied to their post 6 days ago