michael ilie (skdrx)

AI & ML interests

None yet

Recent Activity

- upvoted a collection 1 day ago: Qwen2.5-VL
- updated a model 4 days ago: skdrx/gemma-2-2b-finemath-finetune
- published a model 4 days ago: skdrx/gemma-2-2b-finemath-finetune

Organizations

Parallel Software and Systems Group, Prompt Systematic Review, Lumo Imaging

skdrx's activity

reacted to etemiz's post with πŸ‘€ 15 days ago
-= DeepSeek V3 =-

After installing the new CUDA toolkit and compiling llama.cpp again, I tested DeepSeek V3 yesterday.

In terms of human alignment, DeepSeek V3 did worse on:
- health
- fasting
- nostr
- misinfo
- nutrition

and did better on:
- faith
- bitcoin
- alternative medicine
- ancient wisdom

compared to DeepSeek 2.5. In my opinion it is worse overall than 2.5, and 2.5 wasn't that great.

There is a general tendency of models getting smarter while at the same time getting less wise, less human-aligned, and less beneficial to humans.

I don't know what is causing this, but maybe the use of synthetic datasets for further training LLMs makes them more and more detached from humanity. This is not going in the right direction.

My solution is to come up with a curator council to determine the datasets that are closest to human preference. "Humans who care about other humans the most" could be the definition of such a dataset. What do you think?
3 replies
upvoted an article about 2 months ago
πŸΊπŸ¦β€β¬› LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs

By wolfram
New activity in skdrx/amd135m_reasoning_finetune 4 months ago

Model performance

#1 opened 4 months ago by skdrx