---
license: apache-2.0
datasets:
  - HuggingFaceFW/fineweb
language:
  - en
library_name: transformers
tags:
  - IoT
  - sensor
  - embedded
---

# TinyLLM

## Overview

This repository hosts a small language model developed as part of the TinyLLM framework ([arxiv link]). These models are designed and fine-tuned on sensor data to support embedded sensing applications, enabling locally hosted language models on low-compute devices such as single-board computers. The models are based on the GPT-2 architecture and were trained on NVIDIA H100 GPUs. This repository provides base models that can be further fine-tuned for specific downstream tasks in embedded sensing.

## Model Information

- **Parameters:** 101M (hidden size = 704)
- **Architecture:** Decoder-only transformer (GPT-2 family; see the configuration sketch below)
- **Training Data:** Up to 10B tokens from the SHL and FineWeb datasets, combined in a 4:6 ratio
- **Input and Output Modality:** Text
- **Context Length:** 1024 tokens
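As a rough sketch, these details correspond to a `GPT2Config` from the Transformers library along the following lines. The hidden size and context length come from this card; the layer and head counts are illustrative assumptions, since the card does not state them.

```python
from transformers import GPT2Config

# Sketch of a configuration matching the card above.
# n_embd and n_positions are taken from this card; n_layer and n_head
# are illustrative assumptions (the card does not state them).
config = GPT2Config(
    n_embd=704,        # hidden size, per this card
    n_positions=1024,  # context length, per this card
    n_layer=12,        # assumption, for illustration only
    n_head=11,         # assumption: 704 / 11 = 64-dimensional heads
)
```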

## Acknowledgements

We would like to acknowledge the open-source frameworks llm.c and llama.cpp, which were instrumental in training and testing these models.

## Usage

The model can be used in two primary ways, sketched below:

1. With Hugging Face’s Transformers library
2. With llama.cpp
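A minimal Transformers sketch follows; the repository id is a placeholder, since this card does not spell out the full Hub path, and the prompt is only an example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: replace with the actual Hub id of this repository.
model_id = "your-org/tinyllm-101M"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Example prompt; the base model is a plain causal LM, so it simply
# continues the text.
inputs = tokenizer("The accelerometer readings suggest that the user is",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For llama.cpp, one route is to first convert the model to GGUF (llama.cpp ships a `convert_hf_to_gguf.py` script) and then load it, for example through the optional llama-cpp-python bindings sketched below; the GGUF file name is an assumption.

```python
from llama_cpp import Llama  # optional llama-cpp-python bindings

# Assumes the model was already converted to GGUF, e.g. with
# llama.cpp's convert_hf_to_gguf.py script.
llm = Llama(model_path="tinyllm-101M.gguf", n_ctx=1024)

output = llm("The accelerometer readings suggest that the user is",
             max_tokens=32)
print(output["choices"][0]["text"])
```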

## Disclaimer

This model is intended solely for research purposes.