Saeed

MLDataScientist

AI & ML interests

None yet

Recent Activity

new activity 1 day ago

tomg-group-umd/huginn-0125:Can we quantize the model to GGUF or GPTQ?

new activity 9 days ago

Enturbulate/DeepSeek-v2.5-1210-UD-gguf:Some description with each quant sizes would be nice.

commented on a paper 21 days ago

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Organizations

None yet

MLDataScientist's activity

New activity in tomg-group-umd/huginn-0125 1 day ago

Can we quantize the model to GGUF or GPTQ?

#10 opened 1 day ago by MLDataScientist

New activity in Enturbulate/DeepSeek-v2.5-1210-UD-gguf 9 days ago

Some description with each quant sizes would be nice.

#1 opened 9 days ago by MLDataScientist

commented on a paper 21 days ago

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published 28 days ago • 56

upvoted a paper 21 days ago

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published 28 days ago • 56

New activity in numen-tech/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview-GPTQ-Int4 22 days ago

Is this MLC LLM quantized or GPTQ?

#1 opened 22 days ago by MLDataScientist

liked a model about 1 month ago

unsloth/DeepSeek-R1-GGUF

Text Generation • Updated 14 days ago • 4.23M • 931

New activity in deepseek-ai/DeepSeek-R1 about 1 month ago

Hardware requirements?

#19 opened about 1 month ago by JohnnieB

liked a model about 1 month ago

FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview

Updated Jan 25 • 3.76k • 106

upvoted an article about 1 month ago

FuseO1-Preview: System-II Reasoning Fusion of LLMs

Article • and 4 others • Jan 20 • 17

upvoted a paper about 1 month ago

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published Jan 21 • 63

updated a model about 1 month ago

MLDataScientist/Mistral-Large-Instruct-2407-GPTQ-3bit

Text Generation • Updated Jan 18 • 25

published a model about 1 month ago

MLDataScientist/Mistral-Large-Instruct-2407-GPTQ-3bit

Text Generation • Updated Jan 18 • 25

upvoted an article about 1 month ago

Diving into MiniMax01 405B MoE

Article • Jan 15 • 17

New activity in kaitchup/Mistral-Nemo-Base-2407-AutoRound-GPTQ-asym-4bit about 1 month ago

Request for Mistral Large 2 Instruct 2407 3bit with Autoround GPTQ

#1 opened about 1 month ago by MLDataScientist

liked a model 2 months ago

deepseek-ai/DeepSeek-V3

Text Generation • Updated 3 days ago • 3.23M • 3.56k

upvoted an article 3 months ago

🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs

Article • Dec 4, 2024 • 77

liked a Space 3 months ago

583

AI Video Composer

🏞

Create videos with FFMPEG + Qwen2.5-Coder