|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- josedamico/sugarcane |
|
language: |
|
- en |
|
base_model: |
|
- TinyLlama/TinyLlama-1.1B-Chat-v1.0 |
|
tags: |
|
- sugarcane |
|
--- |
|
# 🌱 TinyLLaMA-Sugarcane
|
|
|
Welcome to the *first open-source LLM fine-tuned for sugarcane production*! 🧠🌾
|
|
|
This model is a fine-tuned version of [`TinyLLaMA`](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0), trained specifically on sugarcane-focused data. Developed by [SciCrop](https://scicrop.com) as part of its commitment to open innovation in agriculture, this is one of the first domain-specific small language models (SLMs) created for the agribusiness sector.
|
|
|
--- |
|
|
|
## 🌍 Why Sugarcane?
|
|
|
Sugarcane is one of the most important crops in Brazil and worldwide, yet most LLMs know very little about its specific production cycle, challenges, and terminology.
|
|
|
By fine-tuning TinyLLaMA on 2,000+ question/answer pairs from real-world sugarcane use cases, we aim to deliver: |
|
|
|
- ✅ Better accuracy

- ✅ Clearer answers

- ✅ Local deployment capabilities for agricultural experts, cooperatives, and researchers
|
|
|
--- |
|
|
|
## 📋 Model Details
|
|
|
- **Base model**: `TinyLLaMA-1.1B-Chat` |
|
- **Fine-tuned on**: Domain-specific QA pairs related to sugarcane |
|
- **Architecture**: Causal LM, fine-tuned with LoRA/QLoRA adapters (a minimal loading sketch follows this list)

- **Tokenizer**: `LlamaTokenizer`
|
- **Model size**: ~1.1B parameters |
|
- **Format**: Available in both HF standard and `GGUF` for local/Ollama use |
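
If you use the standard Hugging Face format, inference with `transformers` might look like the sketch below. The repo id is an assumption taken from the Ollama name used later in this card; adjust it to wherever this model actually lives on the Hub.

```python
# Minimal inference sketch with Hugging Face transformers.
# NOTE: the repo id below is an assumption -- replace it with the actual
# location of this model if it differs.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "infinitestack/tinyllama-sugarcane"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# TinyLLaMA-1.1B-Chat ships with a chat template, so format the question through it.
messages = [{"role": "user", "content": "When is the best time to harvest sugarcane?"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

output = model.generate(input_ids, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```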
|
|
|
--- |
|
|
|
## 🧪 Try it locally with Ollama
|
|
|
We believe local models are the future for privacy-sensitive, domain-specific AI. |
|
|
|
You can run this model locally using [Ollama](https://ollama.com): |
|
|
|
```bash |
|
ollama run infinitestack/tinyllama-sugarcane |
|
``` |
|
|
|
👉 Or explore the model directly:
|
https://ollama.com/infinitestack/tinyllama-sugarcane |
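
If you want to call the local model from your own code, Ollama exposes its standard REST API on `http://localhost:11434`. A minimal sketch, assuming the model has already been pulled with the command above:

```python
# Query a locally running Ollama server over its REST API.
# Assumes `ollama run infinitestack/tinyllama-sugarcane` (or `ollama pull`)
# has already fetched the model.
import json
import urllib.request

payload = {
    "model": "infinitestack/tinyllama-sugarcane",
    "prompt": "What are the main pests affecting sugarcane in Brazil?",
    "stream": False,  # return a single JSON object instead of a token stream
}

request = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(request) as response:
    print(json.loads(response.read())["response"])
```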
|
|
|
--- |
|
|
|
## 🚀 About InfiniteStack
|
|
|
This model is part of **InfiniteStack**, a platform by [SciCrop](https://scicrop.com) that helps companies in the agri-food-energy-environment chain create, train, and deploy their own AI and analytics solutions, securely and at scale.
|
|
|
### 📦 InfiniteStack offers:
|
|
|
- A containerized platform that runs on-prem or in private cloud |
|
- Full support for **SLMs and LLMs** using your **real and private data** |
|
- No/Low-code interfaces to *Collect*, *Automate*, *Leverage*, *Catalog*, *Observe*, and *Track* data pipelines and AI assets |
|
|
|
🔗 Learn more: https://infinitestack.ai
|
|
|
--- |
|
|
|
## 🧠 Why Small Language Models (SLMs)?
|
|
|
SLMs are great when: |
|
|
|
- You need local inference (offline, on-device, or private) |
|
- Your domain is narrow and specific |
|
- You want full control over fine-tuning and usage |
|
- You care about speed, size, and cost-efficiency |
|
|
|
Big isn't always better. Sometimes, smart and focused beats giant and generic. 💡
|
|
|
--- |
|
|
|
## 🤝 Community & Open Innovation
|
|
|
This work reflects SciCrop's ongoing commitment to the open-source ecosystem and to creating useful, usable AI for real-world agribusiness.
|
|
|
Feel free to fork, contribute, fine-tune further, or use it in your own ag project. |
|
We'd love to hear how you're using it!
|
|
|
--- |
|
|
|
## 📁 Files included
|
|
|
This repo includes: |
|
|
|
- `config.json` |
|
- `tokenizer.model` |
|
- `tokenizer.json` |
|
- `model.safetensors` |
|
- `special_tokens_map.json` |
|
- `generation_config.json` |
|
- `tokenizer_config.json` |
|
- `README.md` |
|
|
|
A merged and converted `.gguf` version is also available at **Ollama Hub**. |
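
To fetch the files listed above programmatically, `huggingface_hub` can download the whole snapshot. As before, the repo id is an assumption and should be adjusted to this card's actual location:

```python
# Download all files from this repository into the local Hugging Face cache.
# NOTE: the repo id is an assumption -- adjust it if this model lives elsewhere.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="infinitestack/tinyllama-sugarcane")
print(f"Files downloaded to: {local_dir}")
```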
|
|
|
--- |
|
|
|
## 💬 Questions or Contributions?
|
|
|
Ping us at: |
|
📧 [email protected]

🌐 https://scicrop.com

🌱 https://infinitestack.ai
|
|
|
Made with ☕, 🌾 and ❤️ in Brazil
|
by @josedamico and the InfiniteStack team |