- Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
  Paper • 2412.13663 • Published • 103
- A Survey of Small Language Models
  Paper • 2410.20011 • Published • 40
- No More Adam: Learning Rate Scaling at Initialization is All You Need
  Paper • 2412.11768 • Published • 40
Matricardi Fabio
FM-1976
AI & ML interests
Control system engineering, AI, and LLMs with Python. ThePoorGPUguy on Substack.
Recent Activity
updated a collection 5 days ago: PAPERS
updated a collection 5 days ago: PAPERS
updated a collection 5 days ago: Playgrounds
Playgrounds
Organizations
None yet
Collections 3
spaces 7
models 5
FM-1976/ov_Llama-SmolTalk-3.2-1B-Instruct
Text Generation • Updated • 16
FM-1976/ov_NuExtract-1.5-tiny
Text Generation • Updated • 18
FM-1976/NuExtract-1.5-tiny-ONNX
Updated • 19
FM-1976/gemma-2-2b-it-Q5_K_M-GGUF
Text Generation • Updated • 16 • 1
FM-1976/stablelm-zephyr-3b-openvino-4bit
Updated • 7
datasets
None public yet