- Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
  Paper • 2412.13663 • Published • 125
- A Survey of Small Language Models
  Paper • 2410.20011 • Published • 40
- No More Adam: Learning Rate Scaling at Initialization is All You Need
  Paper • 2412.11768 • Published • 41
Matricardi Fabio
FM-1976
AI & ML interests
Control system engineering, AI, and LLMs with Python. ThePoorGPUguy on Substack.
Recent Activity
liked a model about 24 hours ago
ggml-org/Qwen2.5-Coder-1.5B-Q8_0-GGUF
liked a model 1 day ago
deepseek-ai/Janus-Pro-1B
liked a model 3 days ago
HKUSTAudio/Llasa-3B
Organizations
None yet
Collections
3
spaces
8
models
5
FM-1976/ov_Llama-SmolTalk-3.2-1B-Instruct
Text Generation • Updated • 3
FM-1976/ov_NuExtract-1.5-tiny
Text Generation • Updated • 4
FM-1976/NuExtract-1.5-tiny-ONNX
Updated • 3
FM-1976/gemma-2-2b-it-Q5_K_M-GGUF
Text Generation • Updated • 8 • 1
FM-1976/stablelm-zephyr-3b-openvino-4bit
Updated • 8
datasets
None public yet