Anton Lozhkov

anton-l

AI & ML interests

Generative Models, Distributed Training, Photo and Video Enhancement

Recent Activity

upvoted an article about 1 month ago

SmolLM3: smol, multilingual, long-context reasoner

updated a model about 1 month ago

HuggingFaceTB/SmolLM3-3B-Base

published an article about 1 month ago

SmolLM3: smol, multilingual, long-context reasoner

Posts 1

Post

3173

Introducing 📐FineMath: the best public math pre-training dataset with 50B+ tokens!
HuggingFaceTB/finemath

Math remains challenging for LLMs, and training on FineMath yields considerable gains over other math datasets, especially on GSM8K and MATH.

We build the dataset by:
🛠️ carefully extracting math data from Common Crawl;
🔎 iteratively filtering and recalling high-quality math pages using a classifier trained on synthetic annotations to identify math reasoning and deduction.
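
As a rough illustration of the filtering step, here is a minimal sketch of score-threshold filtering. It is not the FineMath pipeline: `Page`, `toy_math_score`, `filter_pages`, the 0-to-5 score scale, and the threshold of 3.0 are all hypothetical, and the keyword heuristic only stands in for the trained classifier mentioned above. The recall step is not covered.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Page:
    """A crawled page; both fields are illustrative."""
    url: str
    text: str


def toy_math_score(text: str) -> float:
    """Hypothetical stand-in for a trained quality classifier.

    Returns a score in [0, 5] based on the share of tokens that
    contain a math-like marker. A real pipeline would call a model here.
    """
    markers = ("=", "+", "theorem", "proof", "solve", "therefore")
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    hits = sum(any(m in tok for m in markers) for tok in tokens)
    return min(5.0, 10.0 * hits / len(tokens))


def filter_pages(
    pages: Iterable[Page],
    score_fn: Callable[[str], float],
    threshold: float = 3.0,  # assumed cutoff, not taken from the post
) -> List[Page]:
    """Keep only pages whose score reaches the threshold."""
    return [p for p in pages if score_fn(p.text) >= threshold]


if __name__ == "__main__":
    sample = [
        Page("https://example.com/a", "Solve x + 2 = 5 therefore x = 3"),
        Page("https://example.com/b", "Best pizza recipes for the weekend"),
    ]
    for page in filter_pages(sample, toy_math_score):
        print(page.url)
```

Passing the scorer as `score_fn` keeps the filter independent of how the score is computed, so the keyword heuristic could be swapped for a real classifier without touching `filter_pages`.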

We ran a series of ablations comparing the performance of Llama-3.2-3B-Base after continued pre-training on FineMath and observed notable gains over both the baseline model and other public math datasets.

We hope this helps advance the performance of LLMs on math and reasoning! 🚀
We’re also releasing all the ablation models as well as the evaluation code.

HuggingFaceTB/finemath-6763fb8f71b6439b653482c2

Articles 7

Article

622

SmolLM3: smol, multilingual, long-context reasoner

Papers 6

arxiv:2504.05299

arxiv:2502.02737

arxiv:2406.17557

arxiv:2402.19173

Spaces 3

Kinda-English ruDALL-E

Html Parser Viz

YouTube Streaming ASR

Models 70

anton-l/bert_snowflake_regression

Updated May 6, 2024

anton-l/ddpm-butterflies-128

Updated Aug 3, 2023 • 161 • 9

anton-l/ddpm-butterflies-128-test

Updated Jan 11, 2023

anton-l/dream-sna2

Text-to-Image • Updated Jan 8, 2023 • 3

anton-l/dream-sna

Text-to-Image • Updated Jan 8, 2023 • 3

anton-l/ddpm-ema-flowers-64-2gpu

Updated Jan 5, 2023 • 3

anton-l/ddpm-ema-flowers-64-testt

Updated Dec 19, 2022 • 5

anton-l/wav2vec2-base-superb-sv

Audio Classification • Updated Nov 11, 2022 • 223 • 3

anton-l/ddpm-ema-flowers-64-test

Updated Oct 27, 2022 • 6

anton-l/gpt-j-tiny-random

Text Generation • Updated Oct 24, 2022 • 1.94k • 1

Datasets 24

anton-l/superb_demo

Viewer • Updated Jun 20 • 32 • 2.49k • 1

anton-l/superb_dummy

Viewer • Updated Jun 19 • 95 • 803

anton-l/superb

Updated Sep 10, 2024 • 175 • 1

anton-l/fw_edu_200k_3_clusters

Viewer • Updated Aug 11, 2024 • 100k • 37

anton-l/dclm_edu_200k_clusters

Viewer • Updated Aug 11, 2024 • 100k • 19

anton-l/stanford_prompts_1M_rag

Viewer • Updated Mar 28, 2024 • 50k • 27 • 2

anton-l/math_fw_sample

Viewer • Updated Mar 21, 2024 • 46

anton-l/wiki-embed-mxbai-embed-large-v1

Viewer • Updated Mar 19, 2024 • 19.4M • 295

anton-l/wiki-chunked-mxbai-embed-large-v1

Viewer • Updated Mar 18, 2024 • 2.64M • 167

anton-l/wiki_embeddings

Viewer • Updated Mar 15, 2024 • 58.7k • 11
