NeuraLake committed on
Commit
7bb5b6f
·
verified ·
0 Parent(s):

Duplicate from NeuraLake/iSA-02-NoTags-GGUF

.gitattributes ADDED
@@ -0,0 +1,11 @@
+ iSA-02-Nano-1B-NotTags.F16.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.F16.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.F32.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q4_1.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q5_0.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q5_1.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,175 @@
+ ---
+ tags:
+ - text-generation-inference
+ - transformers
+ - facebook
+ - meta
+ - pytorch
+ - gguf
+ - reasoning
+ - context-dynamic
+ - small-models
+ - synthetic-data
+ - function-calls
+ - open-source
+ - llama
+ - NeuraLake
+ - 🇧🇷
+ - 256K
+ license: apache-2.0
+ model_creator: Celso H A Diniz
+ model_name: iSA-02-Nano-1B-Preview
+ ---
+
+ **⚠️ Experimental Release Notice:**
+ This model is in an **experimental phase** on Hugging Face and is **still undergoing training**. Expect further enhancements and updates in the coming week.
+
+ # NeuraLake iSA-02 Series: Advanced Small-Scale Reasoning Models
+
+ ## Overview
+
+ The **NeuraLake iSA-02 Series** comprises compact reasoning models optimized for efficient logical processing in resource-constrained environments. Designed for applications requiring nuanced decision-making and complex problem-solving, these models balance performance with computational efficiency.
+
+ ## Release Information
+
+ Model weights for each variant (1B, 2B, 3B, and 7B parameters) will be released after comprehensive training and optimization to ensure high performance and safety standards.
+
+ # iSA-02-Nano-1B-Preview (**No Structured Tags Variant**)
+
+ The **iSA-02-Nano-1B-Preview** is the latest addition to the iSA-02 series, enhanced with synthetic data to prioritize "thinking before speaking." This focus enhances its reasoning capabilities, making it ideal for applications requiring thoughtful and logical text generation within a compact framework.
+
+ ### What is a Reasoning Model?
+
+ A **reasoning model** simulates human-like logical thinking, enabling the analysis of information, inference drawing, and decision-making based on data. Unlike traditional language models that generate text from patterns, reasoning models excel in understanding, planning, and executing multi-step processes.
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/67355d00728f9dcf37212c02/whZHzNAYQ6eGtpjJJlUM6.png)
+
+
+
+ ### Name and Inspiration
+
+ - **iSA:** Stands for **Intelligent, Small, Autonomous**, reflecting the mission to create compact AI systems with adaptive and intelligent behavior.
+ - **Development:** Initiated in January 2024, the series emerged from experiments combining diverse datasets, revealing initial reasoning capabilities in the base model. Unlike models derived from OpenAI, iSA-02 emphasizes unique reasoning enhancements through innovative synthetic data and contextual refinement.
+
+ ### Lineage
+
+ Based on **[meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)** and refined with synthetic datasets from **[NeuraLake](https://www.neuralake.com.br)**, the iSA-02-Nano-1B-Preview targets improvements in reasoning, long-context handling, and adaptive behaviors.
+
+ ## Key Features
+
+ - **Extended Context Window:** Supports up to **256K tokens** for complex reasoning and Retrieval-Augmented Generation (RAG).
+ - **Adaptive Reasoning:** Adjusts reasoning depth based on context size: concise for <8K tokens and detailed for >16K tokens.
+ - **Efficiency Optimized:** Balances advanced reasoning with low computational demands, suitable for resource-limited settings.
+
+ ## Model Specifications
+
+ ### Architecture
+ - **Type:** Transformer-based
+ - **Layers:** 16
+ - **Hidden Size:** 2048
+ - **Attention Heads:** 32
+ - **Feed-Forward Size:** 8192
+ - **Vocabulary Size:** 128,256
+
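+ For readers who want to see how these dimensions map onto a standard Llama-style configuration, here is a minimal, non-authoritative sketch using the `transformers` `LlamaConfig`. The grouped-query-attention setting and the maximum position embeddings are assumptions carried over from the base Llama-3.2-1B family, not values stated in this card.
+
+ ```python
+ # Minimal sketch: the listed architecture expressed as a LlamaConfig.
+ # num_key_value_heads and max_position_embeddings are assumptions taken
+ # from the base Llama-3.2-1B family, not from this model card.
+ from transformers import LlamaConfig
+
+ isa02_nano_config = LlamaConfig(
+     vocab_size=128256,               # Vocabulary Size
+     hidden_size=2048,                # Hidden Size
+     intermediate_size=8192,          # Feed-Forward Size
+     num_hidden_layers=16,            # Layers
+     num_attention_heads=32,          # Attention Heads
+     num_key_value_heads=8,           # assumed (GQA, as in Llama-3.2-1B)
+     max_position_embeddings=131072,  # assumed; the card advertises up to 256K
+ )
+ print(isa02_nano_config)
+ ```
+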
+ ### Training Parameters
+ - **Precision:** Mixed Precision (fp16)
+ - **Context Window:**
+   - **Text Generation:** 1,024–4,096 tokens
+   - **Logical Reasoning:** 16,000–64,000 tokens
+
+ ### Quantization Versions
+
+ | Version | Architecture | Precision | Parameters | Download |
+ |---------|------------------|-----------|------------|------------------------------------------------------------------------------------------------------|
+ | F32 | Custom Llama 3.2 | FP32 | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.F32.gguf) |
+ | F16 | Custom Llama 3.2 | FP16 | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.F16.gguf) |
+ | Q4_0 | Custom Llama 3.2 | 4-bit | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.Q4_0.gguf) |
+ | Q4_K_M | Custom Llama 3.2 | 4-bit | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.Q4_K_M.gguf) |
+ | Q5_K_M | Custom Llama 3.2 | 5-bit | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.Q5_K_M.gguf) |
+ | Q8_0 | Custom Llama 3.2 | 8-bit | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.Q8_0.gguf) |
+
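+ Assuming a llama.cpp-compatible runtime, the sketch below shows one way to load and query a quantized file from this repository with `llama-cpp-python`. The local path matches the Q4_K_M file shipped here; the context size and GPU-offload setting are illustrative assumptions rather than values from this card.
+
+ ```python
+ # Minimal, unofficial sketch: loading the Q4_K_M GGUF with llama-cpp-python.
+ # n_ctx is a conservative assumption; the card advertises up to 256K tokens,
+ # but larger windows need proportionally more memory for the KV cache.
+ from llama_cpp import Llama
+
+ llm = Llama(
+     model_path="./iSA-02-Nano-1B-NoTags.Q4_K_M.gguf",
+     n_ctx=16384,      # assumed working context; raise it if memory allows
+     n_gpu_layers=-1,  # offload all layers when a GPU is available
+ )
+
+ out = llm(
+     "Summarize the trade-off between the Q4_K_M and Q8_0 quantizations.",
+     max_tokens=256,
+     temperature=0.2,
+ )
+ print(out["choices"][0]["text"])
+ ```
+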
+ ### Hardware Requirements
+
+ | Version | Quantization | File Size | Approx. Memory (RAM/VRAM) |
+ |---------|--------------|-----------|---------------------------|
+ | F32 | FP32 | 4.95 GB | 9.9 GB |
+ | F16 | FP16 | 2.48 GB | 4.96 GB |
+ | Q4_0 | 4-bit | 771 MB | 1.56 GB |
+ | Q4_K_M | 4-bit | 808 MB | 1.62 GB |
+ | Q5_K_M | 5-bit | 912 MB | 1.84 GB |
+ | Q8_0 | 8-bit | 1.32 GB | 2.64 GB |
+
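+ The memory column tracks roughly twice the file size, covering the weights plus KV cache and runtime overhead at modest context lengths. A tiny sketch of that rule of thumb, purely as an estimate:
+
+ ```python
+ # Rough rule of thumb implied by the table above: required memory is about
+ # 2x the GGUF file size at modest context lengths. This is an estimate,
+ # not a guarantee; very long contexts grow the KV cache well beyond this.
+ def estimate_memory_gb(file_size_gb: float, overhead_factor: float = 2.0) -> float:
+     return file_size_gb * overhead_factor
+
+ for name, size_gb in [("Q4_K_M", 0.808), ("Q8_0", 1.32), ("F16", 2.48)]:
+     print(f"{name}: ~{estimate_memory_gb(size_gb):.2f} GB")
+ ```
+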
+ ## Training and Fine-Tuning
+
+ The iSA-02 series is trained on synthetic datasets tailored to enhance logical reasoning, multi-step task execution, and contextual tool usage, supporting robust performance in complex scenarios and adaptive behavior.
+
+ ## Use Cases
+
+ ### Applications
+ - **Logical Reasoning & Decision-Making:** Generate analytical reports from system logs.
+ - **Dynamic Tool Integration:** Ideal for long-context RAG tasks such as querying large databases (see the sketch below).
+ - **Structured Content Generation:** Well suited to correcting OCR outputs and filling in missing data.
+
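+ As a rough illustration of the long-context RAG pattern referenced above, the sketch below packs retrieved passages directly into one large prompt rather than truncating aggressively. The passages list is a hypothetical placeholder for whatever retrieval backend you use, and `llm` is the llama-cpp-python handle from the loading example earlier.
+
+ ```python
+ # Hypothetical long-context RAG sketch: rely on the large context window to
+ # pass many retrieved passages verbatim. The passages list would come from
+ # whatever vector store or search backend you use.
+ def answer_with_context(llm, question: str, passages: list[str], max_tokens: int = 512) -> str:
+     numbered = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
+     prompt = (
+         "Use only the numbered passages below to answer the question.\n\n"
+         f"{numbered}\n\nQuestion: {question}\nAnswer:"
+     )
+     out = llm(prompt, max_tokens=max_tokens, temperature=0.2)
+     return out["choices"][0]["text"].strip()
+ ```
+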
+ ### Limitations
+ - **Unsuitable for:**
+   - High-throughput text generation.
+   - Latency-sensitive applications.
+ - **Challenges:**
+   - Potential biases from synthetic data.
+   - Redundant or verbose reasoning.
+
+ ## Improvements in Version 1.1
+
+ - **Enhanced Reasoning:** Faster processing with reduced overthinking.
+ - **Better Tool Utilization:** More effective use of external tools.
+ - **Improved Context Understanding:** Aligns actions with user intentions.
+ - **Reduced Redundancy:** More concise responses.
+ - **Less Task Aversion:** Fewer refusals of routine tasks.
+ - **Optimized Context Management:** Efficient handling of the 256K context window.
+
+ ## Best Practices
+
+ ### Configuration Recommendations
+ - **max_tokens:**
+   - **Simple Tasks:** 1,024–4,096 tokens
+   - **Complex Tasks:** 8,000–16,000 tokens
+ - **temperature:**
+   - **Objective Responses:** 0.1–0.3
+   - **Creative Reasoning:** 0.7–1.0
+ - **top_p:**
+   - **Focused Outputs:** 0.85
+   - **Precision Tasks:** 0.1
+ - **stop_sequences:**
+   - Use specific sequences such as "Therefore, the answer is" to minimize redundancy (see the sketch below).
+
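+ A minimal sketch of these recommendations applied to a `llama-cpp-python` call; the exact values are illustrative picks from the ranges above, `llm` is the model handle from the loading example earlier, and `log_text` is a placeholder input.
+
+ ```python
+ # Applying the recommended sampling ranges to a llama-cpp-python completion.
+ # The numbers are illustrative picks from the ranges above, not official
+ # defaults; `llm` is the handle from the loading example earlier in this card.
+ log_text = "2024-12-01 04:12:33 ERROR db-pool exhausted after 30s"  # placeholder input
+
+ response = llm(
+     "Analyze the following system log and list the three most likely root causes:\n" + log_text,
+     max_tokens=4096,                    # simple-task range: 1,024-4,096
+     temperature=0.2,                    # objective responses: 0.1-0.3
+     top_p=0.85,                         # focused outputs
+     stop=["Therefore, the answer is"],  # stop sequence to curb redundancy
+ )
+ print(response["choices"][0]["text"])
+ ```
+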
+ ### Prompt Engineering
+ - **Simple Tasks:**
+   - **Example:** `"You are a helpful assistant."`
+ - **Complex Tasks:**
+   - **Example:** `"Transform OCR outputs into valid JSON, return only the JSON data as output."`
+ - **Structured Reasoning:** Does not apply to the "No Structured Tags" variant, as it is neither necessary nor supported (a chat-style usage sketch follows below).
+
+
+
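+ To make the prompt examples above concrete, here is a hedged sketch of chat-style usage through `llama-cpp-python`'s OpenAI-style chat API. `llm` is the handle from the loading example, `ocr_text` is a placeholder input, and the chat template applied is whichever one is embedded in the GGUF file.
+
+ ```python
+ # Hedged sketch: chat-style usage of the example prompts listed above via
+ # llama-cpp-python's OpenAI-style chat API. `llm` is the handle from the
+ # loading example; `ocr_text` is a placeholder OCR output.
+ ocr_text = "Invoice N0. 1O234  Total: 1.2S0,00 BRL  Date: O3/12/2024"
+
+ result = llm.create_chat_completion(
+     messages=[
+         {"role": "system", "content": "Transform OCR outputs into valid JSON, return only the JSON data as output."},
+         {"role": "user", "content": ocr_text},
+     ],
+     max_tokens=2048,
+     temperature=0.1,  # precision task
+ )
+ print(result["choices"][0]["message"]["content"])
+ ```
+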
+ ### Supervision and Monitoring
+ - **Clear Prompts:** Ensure instructions are specific and unambiguous to reduce errors and redundancies.
+
+ ## Known Issues (Addressed in V1.1)
+ - **Task Management:** Improved handling of complex tasks and function calls.
+ - **Unusual Behavior:** Reduced instances of unsolicited online searches or autonomous interactions.
+ - **Conversational Redirection:** Enhanced stability in maintaining topic focus.
+ - **Function Call Execution:** Ensured simulated function calls are actionable.
+
+ ## Citation
+
+ ```bibtex
+ @misc{isa02,
+   author  = {NeuraLake},
+   title   = {iSA-02: The First Small Reasoning Model with Context-Dynamic Behavior},
+   year    = {2024},
+   license = {Apache 2.0},
+   url     = {https://huggingface.co/NeuraLake/iSA-02},
+ }
+ ```
+
+ **Note:** This model card is under development and will be updated with additional details, evaluation metrics, and the final model name.
iSA-02-Nano-1B-NoTags.F16.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1cdb9aa63c9ed49f7a38d3a8c21d1379cb091e893239bdaad6a150be3ecbf275
+ size 2479595776
iSA-02-Nano-1B-NoTags.F32.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4a7f386b3f45562d847116612629c07ae644c1bedc3aff1482b89cc25bee4730
+ size 4951089408
iSA-02-Nano-1B-NoTags.Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:26272c3fbe7e61d266c0a2c0dc5b2d1e73f39ab4cbb75260bd66d147dec8ae27
+ size 770928896
iSA-02-Nano-1B-NoTags.Q4_1.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:feeeb4b883b2bbad87a8b0077fb5927e11e3a2f15ea5995010c3dcc2d62d3e79
+ size 831746304
iSA-02-Nano-1B-NoTags.Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:539156d277f4b5985bae8ea5d5d02e89a2949678dc63016e84ff57862ff7e5c4
+ size 807694592
iSA-02-Nano-1B-NoTags.Q5_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a9e7a0b261a4c1450a0de9fc91cce3adf85d1d8fb3600d92cc5d81273ae6e00d
+ size 892563712
iSA-02-Nano-1B-NoTags.Q5_1.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:471633cac0d8aec23dab430c27d77bbd603daa37ff0912d681f16b558a42ff39
+ size 953381120
iSA-02-Nano-1B-NoTags.Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1ca81610883cf3cafc5b606fd3a15dac16b06840a2409f65b7a66a5969b76d10
+ size 911503616
iSA-02-Nano-1B-NoTags.Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f1adb88b4cd6015909880d3b7ede8b3a9d782a767a26b7ffeceedb492cf66293
+ size 1021800704
iSA-02-Nano-1B-NoTags.Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2e1c3583e9d7d600db2962f3aa060c90cf3ca576354b85cb307cdbcd2e6a813b
+ size 1321083136