TinyGSM (TinyGSM)

delip

authored 2 papers 3 months ago

Faithful Chain-of-Thought Reasoning

Paper • 2301.13379 • Published Jan 31, 2023

WithdrarXiv: A Large-Scale Dataset for Retraction Study

Paper • 2412.03775 • Published Dec 4, 2024

ClaraBing

authored 3 papers 8 months ago

Exposing Attention Glitches with Flip-Flop Language Modeling

Paper • 2306.00946 • Published Jun 1, 2023 • 2

TinyGSM: achieving >80% on GSM8k with small language models

Paper • 2312.09241 • Published Dec 14, 2023 • 39

Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression

Paper • 2306.00788 • Published Jun 1, 2023

sjelassi

authored a paper about 1 year ago

Repeat After Me: Transformers are Better than State Space Models at Copying

Paper • 2402.01032 • Published Feb 1, 2024 • 24

ClaraBing

updated a dataset about 1 year ago

TinyGSM/TinyGSM

Viewer • Updated Jan 11, 2024 • 11.8M • 249 • 6

alexli

authored a paper over 1 year ago

Your Diffusion Model is Secretly a Zero-Shot Classifier

Paper • 2303.16203 • Published Mar 28, 2023

abhishekpanigrahi

authored a paper over 1 year ago

Task-Specific Skill Localization in Fine-tuned Language Models

Paper • 2302.06600 • Published Feb 13, 2023

YuchenLi01

authored a paper over 1 year ago

How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding

Paper • 2303.04245 • Published Mar 7, 2023

sjelassi

authored a paper over 1 year ago

Length Generalization in Arithmetic Transformers

Paper • 2306.15400 • Published Jun 27, 2023 • 4

TinyGSM

AI & ML interests

TinyGSM's activity

Team members 7
