AlphaSue

AI & ML interests

None yet

Organizations

None yet

AlphaSue's activity

liked a Space about 18 hours ago

103

TxT360: Trillion Extracted Text

📖

Create a large, deduplicated dataset for LLM pre-training

liked a model 8 days ago

jinaai/ReaderLM-v2

Text Generation • Updated 23 days ago • 27.3k • 534

liked a Space 8 days ago

1.79k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

upvoted an article about 1 month ago

Article

Mixture of Experts Explained

Dec 11, 2023 • 417

upvoted a collection 2 months ago

Papers I've read

Collection

16 items • Updated Jan 12 • 6

liked a dataset 2 months ago

microsoft/RedStone

Updated Dec 5, 2024 • 76 • 33

liked a model 2 months ago

open-web-math/filtering-models

Updated Nov 2, 2023 • 9

liked a dataset 2 months ago

m-a-p/FineFineWeb

Viewer • Updated Dec 19, 2024 • 4.89B • 1.15M • 37

upvoted a paper 4 months ago

JudgeBench: A Benchmark for Evaluating LLM-based Judges

Paper • 2410.12784 • Published Oct 16, 2024 • 46

New activity in jinaai/reader-lm-1.5b 4 months ago

Temperature and repetition_penalty

#1 opened 6 months ago by

ayyylol

liked 2 models 6 months ago

nvidia/quality-classifier-deberta

Updated 28 days ago • 17.1k • 56

oliverguhr/fullstop-punctuation-multilang-large

Token Classification • Updated Nov 16, 2023 • 286k • 158

liked a dataset 8 months ago

teknium/OpenHermes-2.5

Viewer • Updated Apr 15, 2024 • 1M • 2.21k • 715

liked a model 9 months ago

Snowflake/snowflake-arctic-embed-m

liked a Space 9 months ago

809

FineWeb: decanting the web for the finest text data at scale

🍷

Generate high-quality web text data for LLM training

liked a dataset 9 months ago

liwu/MNBVC

Updated Aug 23, 2024 • 23.8k • 523

liked 3 datasets 10 months ago

upvoted an article 11 months ago

Article

Large-scale Near-deduplication Behind BigCode

May 16, 2023 • 21