Mark Washington

Mdubbya

AI & ML interests

None yet

Recent Activity

liked a model about 9 hours ago

allenai/OLMo-2-0325-32B

liked a dataset about 9 hours ago

EleutherAI/proof-pile-2

liked a model about 10 hours ago

sesame/csm-1b

Organizations

None yet

Mdubbya's activity

upvoted 2 articles 1 day ago

Article

Dynamic Intuition-Based Reasoning: A Novel Approach Toward Artificial General Intelligence

1 day ago • 2

Article

Benchmarking Assisted Generation with Gemma 3 and Qwen 2.5: A Code-First Guide

2 days ago • 1

upvoted an article 3 days ago

Article

Open R1: Update #3

By

and 9 others •

3 days ago

• 207

upvoted a collection 7 days ago

Open LLM Leaderboard best models ❤️‍🔥

A daily uploaded list of models with the best evaluations on the LLM leaderboard • 65 items • Updated about 11 hours ago • 555

upvoted a collection 10 days ago

🧠 Reasoning datasets

Datasets with reasoning traces for math and code released by the community • 14 items • Updated 3 days ago • 102

upvoted an article 14 days ago

Article

Common AI Model Formats

15 days ago • 29

upvoted a collection 15 days ago

Granite 3.2 Language Models

3 items • Updated 16 days ago • 14

upvoted an article 22 days ago

Article

WTF is Fine-Tuning? (intro4devs) | [2025]

26 days ago • 6

upvoted a collection 25 days ago

Long Context - 16k,32k,64k,128k,200k,256k,512k,1000k

Q6/Q8 models here. Mixtrals/Mistral (and merges) generally have 32k context (not listed here). Please see the org model card for usage / templates. • 71 items • Updated about 8 hours ago • 12

upvoted 2 collections about 1 month ago

Open-source speech datasets annotated using Data-Speech

Open-source annotated speech datasets ranging from 1,000 hours to 45,000 hours. • 11 items • Updated Aug 8, 2024 • 5

DeepSeek-R1-ReDistill

Re-distilled DeepSeek R1 models • 4 items • Updated Jan 30 • 14

upvoted 2 collections 3 months ago

QwQ-abliterate

5 items • Updated 7 days ago • 6

Sana

⚡️Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer • 21 items • Updated Feb 10 • 88

upvoted a collection 4 months ago

Tulu 3 Datasets

All datasets released with Tulu 3 -- state of the art open post-training recipes. • 33 items • Updated about 19 hours ago • 75

upvoted a collection 7 months ago

Qwen2

Qwen2 language models, pretrained and instruction-tuned, in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Nov 28, 2024 • 359

upvoted 3 collections 8 months ago

K2

K2, LLM360's most powerful, scaled model series. • 7 items • Updated Oct 7, 2024 • 10

DeepSeekCoder-V2

6 items • Updated Sep 5, 2024 • 93

InternLM2.5

14 items • Updated Feb 11 • 71

upvoted 2 collections 9 months ago

CodeGemma Release

18 items • Updated 2 days ago • 81

Gemma release

Groups the Gemma models released by the Google team. • 40 items • Updated 2 days ago • 330