Commit
Β·
7bb5b6f
verified
Β·
0
Parent(s):
Duplicate from NeuraLake/iSA-02-NoTags-GGUF
Browse files- .gitattributes +11 -0
- README.md +175 -0
- iSA-02-Nano-1B-NoTags.F16.gguf +3 -0
- iSA-02-Nano-1B-NoTags.F32.gguf +3 -0
- iSA-02-Nano-1B-NoTags.Q4_0.gguf +3 -0
- iSA-02-Nano-1B-NoTags.Q4_1.gguf +3 -0
- iSA-02-Nano-1B-NoTags.Q4_K_M.gguf +3 -0
- iSA-02-Nano-1B-NoTags.Q5_0.gguf +3 -0
- iSA-02-Nano-1B-NoTags.Q5_1.gguf +3 -0
- iSA-02-Nano-1B-NoTags.Q5_K_M.gguf +3 -0
- iSA-02-Nano-1B-NoTags.Q6_K.gguf +3 -0
- iSA-02-Nano-1B-NoTags.Q8_0.gguf +3 -0
.gitattributes
ADDED
@@ -0,0 +1,11 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
iSA-02-Nano-1B-NotTags.F16.gguf filter=lfs diff=lfs merge=lfs -text
|
2 |
+
iSA-02-Nano-1B-NoTags.F16.gguf filter=lfs diff=lfs merge=lfs -text
|
3 |
+
iSA-02-Nano-1B-NoTags.F32.gguf filter=lfs diff=lfs merge=lfs -text
|
4 |
+
iSA-02-Nano-1B-NoTags.Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
|
5 |
+
iSA-02-Nano-1B-NoTags.Q4_1.gguf filter=lfs diff=lfs merge=lfs -text
|
6 |
+
iSA-02-Nano-1B-NoTags.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
|
7 |
+
iSA-02-Nano-1B-NoTags.Q5_0.gguf filter=lfs diff=lfs merge=lfs -text
|
8 |
+
iSA-02-Nano-1B-NoTags.Q5_1.gguf filter=lfs diff=lfs merge=lfs -text
|
9 |
+
iSA-02-Nano-1B-NoTags.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
|
10 |
+
iSA-02-Nano-1B-NoTags.Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
|
11 |
+
iSA-02-Nano-1B-NoTags.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
|
README.md
ADDED
@@ -0,0 +1,175 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
---
|
2 |
+
tags:
|
3 |
+
- text-generation-inference
|
4 |
+
- transformers
|
5 |
+
- facebook
|
6 |
+
- meta
|
7 |
+
- pytorch
|
8 |
+
- gguf
|
9 |
+
- reasoning
|
10 |
+
- context-dynamic
|
11 |
+
- small-models
|
12 |
+
- synthetic-data
|
13 |
+
- function-calls
|
14 |
+
- open-source
|
15 |
+
- llama
|
16 |
+
- NeuraLake
|
17 |
+
- π§π·
|
18 |
+
- 256K
|
19 |
+
license: apache-2.0
|
20 |
+
model_creator: Celso H A Diniz
|
21 |
+
model_name: iSA-02-Nano-1B-Preview
|
22 |
+
---
|
23 |
+
|
24 |
+
**β οΈ Experimental Release Notice:**
|
25 |
+
This model is in an **experimental phase** on Hugging Face and is **still undergoing training**. Expect further enhancements and updates in the coming week.
|
26 |
+
|
27 |
+
# NeuraLake iSA-02 Series: Advanced Small-Scale Reasoning Models
|
28 |
+
|
29 |
+
## Overview
|
30 |
+
|
31 |
+
The **NeuraLake iSA-02 Series** comprises compact reasoning models optimized for efficient logical processing in resource-constrained environments. Designed for applications requiring nuanced decision-making and complex problem-solving, these models balance performance with computational efficiency.
|
32 |
+
|
33 |
+
## Release Information
|
34 |
+
|
35 |
+
Model weights for each variant (1B, 2B, 3B, and 7B parameters) will be released post comprehensive training and optimization to ensure high performance and safety standards.
|
36 |
+
|
37 |
+
# iSA-02-Nano-1B-Preview (**No Structured Tags Variant**)
|
38 |
+
|
39 |
+
The **iSA-02-Nano-1B-Preview** is the latest addition to the iSA-02 series, enhanced with synthetic data to prioritize **βthinking before speaking.β** This focus enhances its reasoning capabilities, making it ideal for applications requiring thoughtful and logical text generation within a compact framework.
|
40 |
+
|
41 |
+
### What is a Reasoning Model?
|
42 |
+
|
43 |
+
A **reasoning model** simulates human-like logical thinking, enabling the analysis of information, inference drawing, and decision-making based on data. Unlike traditional language models that generate text from patterns, reasoning models excel in understanding, planning, and executing multi-step processes.
|
44 |
+
|
45 |
+

|
46 |
+
|
47 |
+
|
48 |
+
|
49 |
+
### Name and Inspiration
|
50 |
+
|
51 |
+
- **iSA:** Stands for **Intelligent, Small, Autonomous**, reflecting the mission to create compact AI systems with adaptive and intelligent behavior.
|
52 |
+
- **Development:** Initiated in January 2024, the series emerged from experiments combining diverse datasets, revealing initial reasoning capabilities in the base model. Unlike models derived from OpenAI, iSA-02 emphasizes unique reasoning enhancements through innovative synthetic data and contextual refinement.
|
53 |
+
|
54 |
+
### Lineage
|
55 |
+
|
56 |
+
Based on **[meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)** and refined with synthetic datasets from **[NeuraLake](https://www.neuralake.com.br)**, the iSA-02-Nano-1B-Preview targets improvements in reasoning, long-context handling, and adaptive behaviors.
|
57 |
+
|
58 |
+
## Key Features
|
59 |
+
|
60 |
+
- **Extended Context Window:** Supports up to **256K tokens** for complex reasoning and Retrieval-Augmented Generation (RAG).
|
61 |
+
- **Adaptive Reasoning:** Adjusts reasoning depth based on context sizeβconcise for <8K tokens and detailed for >16K tokens.
|
62 |
+
- **Efficiency Optimized:** Balances advanced reasoning with low computational demands, suitable for resource-limited settings.
|
63 |
+
|
64 |
+
## Model Specifications
|
65 |
+
|
66 |
+
### Architecture
|
67 |
+
- **Type:** Transformer-based
|
68 |
+
- **Layers:** 16
|
69 |
+
- **Hidden Size:** 2048
|
70 |
+
- **Attention Heads:** 32
|
71 |
+
- **Feed-Forward Size:** 8192
|
72 |
+
- **Vocabulary Size:** 128,256
|
73 |
+
|
74 |
+
### Training Parameters
|
75 |
+
- **Precision:** Mixed Precision (fp16)
|
76 |
+
- **Context Window:**
|
77 |
+
- **Text Generation:** 1,024β4,096 tokens
|
78 |
+
- **Logical Reasoning:** 16,000β64,000 tokens
|
79 |
+
|
80 |
+
### Quantization Versions
|
81 |
+
|
82 |
+
| Version | Format | Bits | Parameters | Download |
|
83 |
+
|---------|-----------------|------|------------|------------------------------------------------------------------------------------------------------|
|
84 |
+
| F32 | Custom Llama 3.2 | FP32 | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.F32.gguf) |
|
85 |
+
| F16 | Custom Llama 3.2 | FP16 | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.F16.gguf) |
|
86 |
+
| Q4_0 | Custom Llama 3.2 | 4-bit| 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.Q4_0.gguf) |
|
87 |
+
| Q4_K_M | Custom Llama 3.2 | 4-bit| 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.Q4_K_M.gguf) |
|
88 |
+
| Q5_K_M | Custom Llama 3.2 | 5-bit| 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.Q5_K_M.gguf) |
|
89 |
+
| Q8_0 | Custom Llama 3.2 | 8-bit| 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.Q8_0.gguf) |
|
90 |
+
|
91 |
+
### Hardware Requirements
|
92 |
+
|
93 |
+
| Version | Quantization | Size | Memory (RAM/vRAM) |
|
94 |
+
|---------|--------------|--------|-------------------|
|
95 |
+
| F32 | FP32 | 4.95 GB| 9.9 GB |
|
96 |
+
| F16 | FP16 | 2.48 GB| 4.96 GB |
|
97 |
+
| Q4_0 | 4-bit | 771 MB | 1.56 GB |
|
98 |
+
| Q4_K_M | 4-bit | 808 MB | 1.62 GB |
|
99 |
+
| Q5_K_M | 5-bit | 912 MB | 1.84 GB |
|
100 |
+
| Q8_0 | 8-bit | 1.32 GB| 2.64 GB |
|
101 |
+
|
102 |
+
## Training and Fine-Tuning
|
103 |
+
|
104 |
+
Trained on synthetic datasets tailored to enhance logical reasoning, multi-step task execution, and contextual tool usage, the iSA-02 series ensures robust performance in complex scenarios and adaptive behaviors.
|
105 |
+
|
106 |
+
## Use Cases
|
107 |
+
|
108 |
+
### Applications
|
109 |
+
- **Logical Reasoning & Decision-Making:** Generate analytical reports from system logs.
|
110 |
+
- **Dynamic Tool Integration:** Ideal for long-context RAG tasks like querying large databases.
|
111 |
+
- **Structured Content Generation:** Perfect for correcting OCR outputs and filling in missing data.
|
112 |
+
|
113 |
+
### Limitations
|
114 |
+
- **Unsuitable for:**
|
115 |
+
- High-throughput text generation.
|
116 |
+
- Latency-sensitive applications.
|
117 |
+
- **Challenges:**
|
118 |
+
- Potential biases from synthetic data.
|
119 |
+
- Redundant or verbose reasoning.
|
120 |
+
|
121 |
+
## Improvements in Version 1.1
|
122 |
+
|
123 |
+
- **Enhanced Reasoning:** Faster processing with reduced overthinking.
|
124 |
+
- **Better Tool Utilization:** More effective use of external tools.
|
125 |
+
- **Improved Context Understanding:** Aligns actions with user intentions.
|
126 |
+
- **Reduced Redundancy:** More concise responses.
|
127 |
+
- **Less Task Aversion:** Fewer refusals of routine tasks.
|
128 |
+
- **Optimized Context Management:** Efficient handling of the 256K context window.
|
129 |
+
|
130 |
+
## Best Practices
|
131 |
+
|
132 |
+
### Configuration Recommendations
|
133 |
+
- **max_tokens:**
|
134 |
+
- **Simple Tasks:** 1,024β4,096 tokens
|
135 |
+
- **Complex Tasks:** 8,000β16,000 tokens
|
136 |
+
- **temperature:**
|
137 |
+
- **Objective Responses:** 0.1β0.3
|
138 |
+
- **Creative Reasoning:** 0.7β1.0
|
139 |
+
- **top_p:**
|
140 |
+
- **Focused Outputs:** 0.85
|
141 |
+
- **Precision Tasks:** 0.1
|
142 |
+
- **stop_sequences:**
|
143 |
+
- Use specific sequences like "Therefore, the answer is," to minimize redundancy.
|
144 |
+
|
145 |
+
### Prompt Engineering
|
146 |
+
- **Simple Tasks:**
|
147 |
+
- **Example:** `"You are a helpful assistant."`
|
148 |
+
- **Complex Tasks:**
|
149 |
+
- **Example:** `"Transform OCR outputs into valid JSON, return only the JSON data as output."`
|
150 |
+
- **Structured Reasoning**: "Not apply in "No Structured Tags", as it is not necessary or supported."
|
151 |
+
|
152 |
+
|
153 |
+
|
154 |
+
### Supervision and Monitoring
|
155 |
+
- **Clear Prompts:** Ensure instructions are specific and unambiguous to reduce errors and redundancies.
|
156 |
+
|
157 |
+
## Known Issues (Addressed in V1.1)
|
158 |
+
- **Task Management:** Improved handling of complex tasks and function calls.
|
159 |
+
- **Unusual Behavior:** Reduced instances of unsolicited online searches or autonomous interactions.
|
160 |
+
- **Conversational Redirection:** Enhanced stability in maintaining topic focus.
|
161 |
+
- **Function Call Execution:** Ensured simulated function calls are actionable.
|
162 |
+
|
163 |
+
## Citation
|
164 |
+
|
165 |
+
```bibtex
|
166 |
+
@misc{isa02,
|
167 |
+
author = {NeuraLake},
|
168 |
+
title = {iSA-02: The First Small Reasoning Model with Context-Dynamic Behavior},
|
169 |
+
year = {2024},
|
170 |
+
license = {Apache 2.0},
|
171 |
+
url = {https://huggingface.co/NeuraLake/iSA-02},
|
172 |
+
}
|
173 |
+
```
|
174 |
+
|
175 |
+
**Note:** This model card is under development and will be updated with additional details, evaluation metrics, and the final model name.
|
iSA-02-Nano-1B-NoTags.F16.gguf
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:1cdb9aa63c9ed49f7a38d3a8c21d1379cb091e893239bdaad6a150be3ecbf275
|
3 |
+
size 2479595776
|
iSA-02-Nano-1B-NoTags.F32.gguf
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:4a7f386b3f45562d847116612629c07ae644c1bedc3aff1482b89cc25bee4730
|
3 |
+
size 4951089408
|
iSA-02-Nano-1B-NoTags.Q4_0.gguf
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:26272c3fbe7e61d266c0a2c0dc5b2d1e73f39ab4cbb75260bd66d147dec8ae27
|
3 |
+
size 770928896
|
iSA-02-Nano-1B-NoTags.Q4_1.gguf
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:feeeb4b883b2bbad87a8b0077fb5927e11e3a2f15ea5995010c3dcc2d62d3e79
|
3 |
+
size 831746304
|
iSA-02-Nano-1B-NoTags.Q4_K_M.gguf
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:539156d277f4b5985bae8ea5d5d02e89a2949678dc63016e84ff57862ff7e5c4
|
3 |
+
size 807694592
|
iSA-02-Nano-1B-NoTags.Q5_0.gguf
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:a9e7a0b261a4c1450a0de9fc91cce3adf85d1d8fb3600d92cc5d81273ae6e00d
|
3 |
+
size 892563712
|
iSA-02-Nano-1B-NoTags.Q5_1.gguf
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:471633cac0d8aec23dab430c27d77bbd603daa37ff0912d681f16b558a42ff39
|
3 |
+
size 953381120
|
iSA-02-Nano-1B-NoTags.Q5_K_M.gguf
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:1ca81610883cf3cafc5b606fd3a15dac16b06840a2409f65b7a66a5969b76d10
|
3 |
+
size 911503616
|
iSA-02-Nano-1B-NoTags.Q6_K.gguf
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:f1adb88b4cd6015909880d3b7ede8b3a9d782a767a26b7ffeceedb492cf66293
|
3 |
+
size 1021800704
|
iSA-02-Nano-1B-NoTags.Q8_0.gguf
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:2e1c3583e9d7d600db2962f3aa060c90cf3ca576354b85cb307cdbcd2e6a813b
|
3 |
+
size 1321083136
|