NeuraLake committed on
Commit
7bb5b6f
·
verified ·
0 Parent(s):

Duplicate from NeuraLake/iSA-02-NoTags-GGUF

.gitattributes ADDED
@@ -0,0 +1,11 @@
+ iSA-02-Nano-1B-NotTags.F16.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.F16.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.F32.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q4_1.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q5_0.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q5_1.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-NoTags.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,175 @@
+ ---
+ tags:
+ - text-generation-inference
+ - transformers
+ - facebook
+ - meta
+ - pytorch
+ - gguf
+ - reasoning
+ - context-dynamic
+ - small-models
+ - synthetic-data
+ - function-calls
+ - open-source
+ - llama
+ - NeuraLake
+ - 🇧🇷
+ - 256K
+ license: apache-2.0
+ model_creator: Celso H A Diniz
+ model_name: iSA-02-Nano-1B-Preview
+ ---
+
+ **⚠️ Experimental Release Notice:**
+ This model is in an **experimental phase** on Hugging Face and is **still undergoing training**. Expect further enhancements and updates in the coming week.
+
+ # NeuraLake iSA-02 Series: Advanced Small-Scale Reasoning Models
+
+ ## Overview
+
+ The **NeuraLake iSA-02 Series** comprises compact reasoning models optimized for efficient logical processing in resource-constrained environments. Designed for applications requiring nuanced decision-making and complex problem-solving, these models balance performance with computational efficiency.
+
+ ## Release Information
+
+ Model weights for each variant (1B, 2B, 3B, and 7B parameters) will be released after comprehensive training and optimization to ensure high performance and safety standards.
+
+ # iSA-02-Nano-1B-Preview (**No Structured Tags Variant**)
+
+ The **iSA-02-Nano-1B-Preview** is the latest addition to the iSA-02 series, enhanced with synthetic data to prioritize "thinking before speaking." This focus enhances its reasoning capabilities, making it ideal for applications requiring thoughtful and logical text generation within a compact framework.
+
+ ### What is a Reasoning Model?
+
+ A **reasoning model** simulates human-like logical thinking, enabling the analysis of information, inference drawing, and decision-making based on data. Unlike traditional language models that generate text from patterns, reasoning models excel in understanding, planning, and executing multi-step processes.
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/67355d00728f9dcf37212c02/whZHzNAYQ6eGtpjJJlUM6.png)
+
+
+
+ ### Name and Inspiration
+
+ - **iSA:** Stands for **Intelligent, Small, Autonomous**, reflecting the mission to create compact AI systems with adaptive and intelligent behavior.
+ - **Development:** Initiated in January 2024, the series emerged from experiments combining diverse datasets, revealing initial reasoning capabilities in the base model. Unlike models derived from OpenAI, iSA-02 emphasizes unique reasoning enhancements through innovative synthetic data and contextual refinement.
+
+ ### Lineage
+
+ Based on **[meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)** and refined with synthetic datasets from **[NeuraLake](https://www.neuralake.com.br)**, the iSA-02-Nano-1B-Preview targets improvements in reasoning, long-context handling, and adaptive behaviors.
+
+ ## Key Features
+
+ - **Extended Context Window:** Supports up to **256K tokens** for complex reasoning and Retrieval-Augmented Generation (RAG).
+ - **Adaptive Reasoning:** Adjusts reasoning depth based on context size: concise for <8K tokens and detailed for >16K tokens.
+ - **Efficiency Optimized:** Balances advanced reasoning with low computational demands, suitable for resource-limited settings.
+
+ ## Model Specifications
+
+ ### Architecture
+ - **Type:** Transformer-based
+ - **Layers:** 16
+ - **Hidden Size:** 2048
+ - **Attention Heads:** 32
+ - **Feed-Forward Size:** 8192
+ - **Vocabulary Size:** 128,256
+
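+ For readers who want to see how these dimensions map onto a standard Llama-style configuration, here is a minimal, non-authoritative sketch using the `transformers` `LlamaConfig`. The grouped-query-attention setting and the maximum position embeddings are assumptions carried over from the base Llama-3.2-1B family, not values stated in this card.
+
+ ```python
+ # Minimal sketch: the listed architecture expressed as a LlamaConfig.
+ # num_key_value_heads and max_position_embeddings are assumptions taken
+ # from the base Llama-3.2-1B family, not from this model card.
+ from transformers import LlamaConfig
+
+ isa02_nano_config = LlamaConfig(
+     vocab_size=128256,               # Vocabulary Size
+     hidden_size=2048,                # Hidden Size
+     intermediate_size=8192,          # Feed-Forward Size
+     num_hidden_layers=16,            # Layers
+     num_attention_heads=32,          # Attention Heads
+     num_key_value_heads=8,           # assumed (GQA, as in Llama-3.2-1B)
+     max_position_embeddings=131072,  # assumed; the card advertises up to 256K
+ )
+ print(isa02_nano_config)
+ ```
+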
+ ### Training Parameters
+ - **Precision:** Mixed Precision (fp16)
+ - **Context Window:**
+   - **Text Generation:** 1,024–4,096 tokens
+   - **Logical Reasoning:** 16,000–64,000 tokens
+
+ ### Quantization Versions
+
+ | Version | Architecture | Precision | Parameters | Download |
+ |---------|------------------|-----------|------------|------------------------------------------------------------------------------------------------------|
+ | F32 | Custom Llama 3.2 | FP32 | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.F32.gguf) |
+ | F16 | Custom Llama 3.2 | FP16 | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.F16.gguf) |
+ | Q4_0 | Custom Llama 3.2 | 4-bit | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.Q4_0.gguf) |
+ | Q4_K_M | Custom Llama 3.2 | 4-bit | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.Q4_K_M.gguf) |
+ | Q5_K_M | Custom Llama 3.2 | 5-bit | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.Q5_K_M.gguf) |
+ | Q8_0 | Custom Llama 3.2 | 8-bit | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-Nano-1B-Preview/resolve/main/iSA-02-Nano-1B-Preview.Q8_0.gguf) |
+
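+ Assuming a llama.cpp-compatible runtime, the sketch below shows one way to load and query a quantized file from this repository with `llama-cpp-python`. The local path matches the Q4_K_M file shipped here; the context size and GPU-offload setting are illustrative assumptions rather than values from this card.
+
+ ```python
+ # Minimal, unofficial sketch: loading the Q4_K_M GGUF with llama-cpp-python.
+ # n_ctx is a conservative assumption; the card advertises up to 256K tokens,
+ # but larger windows need proportionally more memory for the KV cache.
+ from llama_cpp import Llama
+
+ llm = Llama(
+     model_path="./iSA-02-Nano-1B-NoTags.Q4_K_M.gguf",
+     n_ctx=16384,      # assumed working context; raise it if memory allows
+     n_gpu_layers=-1,  # offload all layers when a GPU is available
+ )
+
+ out = llm(
+     "Summarize the trade-off between the Q4_K_M and Q8_0 quantizations.",
+     max_tokens=256,
+     temperature=0.2,
+ )
+ print(out["choices"][0]["text"])
+ ```
+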
+ ### Hardware Requirements
+
+ | Version | Quantization | File Size | Approx. Memory (RAM/VRAM) |
+ |---------|--------------|-----------|---------------------------|
+ | F32 | FP32 | 4.95 GB | 9.9 GB |
+ | F16 | FP16 | 2.48 GB | 4.96 GB |
+ | Q4_0 | 4-bit | 771 MB | 1.56 GB |
+ | Q4_K_M | 4-bit | 808 MB | 1.62 GB |
+ | Q5_K_M | 5-bit | 912 MB | 1.84 GB |
+ | Q8_0 | 8-bit | 1.32 GB | 2.64 GB |
+
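+ The memory column tracks roughly twice the file size, covering the weights plus KV cache and runtime overhead at modest context lengths. A tiny sketch of that rule of thumb, purely as an estimate:
+
+ ```python
+ # Rough rule of thumb implied by the table above: required memory is about
+ # 2x the GGUF file size at modest context lengths. This is an estimate,
+ # not a guarantee; very long contexts grow the KV cache well beyond this.
+ def estimate_memory_gb(file_size_gb: float, overhead_factor: float = 2.0) -> float:
+     return file_size_gb * overhead_factor
+
+ for name, size_gb in [("Q4_K_M", 0.808), ("Q8_0", 1.32), ("F16", 2.48)]:
+     print(f"{name}: ~{estimate_memory_gb(size_gb):.2f} GB")
+ ```
+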
+ ## Training and Fine-Tuning
+
+ The iSA-02 series is trained on synthetic datasets tailored to enhance logical reasoning, multi-step task execution, and contextual tool usage, supporting robust performance in complex scenarios and adaptive behavior.
+
+ ## Use Cases
+
+ ### Applications
+ - **Logical Reasoning & Decision-Making:** Generate analytical reports from system logs.
+ - **Dynamic Tool Integration:** Ideal for long-context RAG tasks such as querying large databases (see the sketch below).
+ - **Structured Content Generation:** Well suited to correcting OCR outputs and filling in missing data.
+
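+ As a rough illustration of the long-context RAG pattern referenced above, the sketch below packs retrieved passages directly into one large prompt rather than truncating aggressively. The passages list is a hypothetical placeholder for whatever retrieval backend you use, and `llm` is the llama-cpp-python handle from the loading example earlier.
+
+ ```python
+ # Hypothetical long-context RAG sketch: rely on the large context window to
+ # pass many retrieved passages verbatim. The passages list would come from
+ # whatever vector store or search backend you use.
+ def answer_with_context(llm, question: str, passages: list[str], max_tokens: int = 512) -> str:
+     numbered = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
+     prompt = (
+         "Use only the numbered passages below to answer the question.\n\n"
+         f"{numbered}\n\nQuestion: {question}\nAnswer:"
+     )
+     out = llm(prompt, max_tokens=max_tokens, temperature=0.2)
+     return out["choices"][0]["text"].strip()
+ ```
+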
+ ### Limitations
+ - **Unsuitable for:**
+   - High-throughput text generation.
+   - Latency-sensitive applications.
+ - **Challenges:**
+   - Potential biases from synthetic data.
+   - Redundant or verbose reasoning.
+
+ ## Improvements in Version 1.1
+
+ - **Enhanced Reasoning:** Faster processing with reduced overthinking.
+ - **Better Tool Utilization:** More effective use of external tools.
+ - **Improved Context Understanding:** Aligns actions with user intentions.
+ - **Reduced Redundancy:** More concise responses.
+ - **Less Task Aversion:** Fewer refusals of routine tasks.
+ - **Optimized Context Management:** Efficient handling of the 256K context window.
+
+ ## Best Practices
+
+ ### Configuration Recommendations
+ - **max_tokens:**
+   - **Simple Tasks:** 1,024–4,096 tokens
+   - **Complex Tasks:** 8,000–16,000 tokens
+ - **temperature:**
+   - **Objective Responses:** 0.1–0.3
+   - **Creative Reasoning:** 0.7–1.0
+ - **top_p:**
+   - **Focused Outputs:** 0.85
+   - **Precision Tasks:** 0.1
+ - **stop_sequences:**
+   - Use specific sequences such as "Therefore, the answer is" to minimize redundancy (see the sketch below).
+
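+ A minimal sketch of these recommendations applied to a `llama-cpp-python` call; the exact values are illustrative picks from the ranges above, `llm` is the model handle from the loading example earlier, and `log_text` is a placeholder input.
+
+ ```python
+ # Applying the recommended sampling ranges to a llama-cpp-python completion.
+ # The numbers are illustrative picks from the ranges above, not official
+ # defaults; `llm` is the handle from the loading example earlier in this card.
+ log_text = "2024-12-01 04:12:33 ERROR db-pool exhausted after 30s"  # placeholder input
+
+ response = llm(
+     "Analyze the following system log and list the three most likely root causes:\n" + log_text,
+     max_tokens=4096,                    # simple-task range: 1,024-4,096
+     temperature=0.2,                    # objective responses: 0.1-0.3
+     top_p=0.85,                         # focused outputs
+     stop=["Therefore, the answer is"],  # stop sequence to curb redundancy
+ )
+ print(response["choices"][0]["text"])
+ ```
+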
+ ### Prompt Engineering
+ - **Simple Tasks:**
+   - **Example:** `"You are a helpful assistant."`
+ - **Complex Tasks:**
+   - **Example:** `"Transform OCR outputs into valid JSON, return only the JSON data as output."`
+ - **Structured Reasoning:** Does not apply to the "No Structured Tags" variant, as it is neither necessary nor supported (a chat-style usage sketch follows below).
+
+
+
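+ To make the prompt examples above concrete, here is a hedged sketch of chat-style usage through `llama-cpp-python`'s OpenAI-style chat API. `llm` is the handle from the loading example, `ocr_text` is a placeholder input, and the chat template applied is whichever one is embedded in the GGUF file.
+
+ ```python
+ # Hedged sketch: chat-style usage of the example prompts listed above via
+ # llama-cpp-python's OpenAI-style chat API. `llm` is the handle from the
+ # loading example; `ocr_text` is a placeholder OCR output.
+ ocr_text = "Invoice N0. 1O234  Total: 1.2S0,00 BRL  Date: O3/12/2024"
+
+ result = llm.create_chat_completion(
+     messages=[
+         {"role": "system", "content": "Transform OCR outputs into valid JSON, return only the JSON data as output."},
+         {"role": "user", "content": ocr_text},
+     ],
+     max_tokens=2048,
+     temperature=0.1,  # precision task
+ )
+ print(result["choices"][0]["message"]["content"])
+ ```
+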
+ ### Supervision and Monitoring
+ - **Clear Prompts:** Ensure instructions are specific and unambiguous to reduce errors and redundancies.
+
+ ## Known Issues (Addressed in V1.1)
+ - **Task Management:** Improved handling of complex tasks and function calls.
+ - **Unusual Behavior:** Reduced instances of unsolicited online searches or autonomous interactions.
+ - **Conversational Redirection:** Enhanced stability in maintaining topic focus.
+ - **Function Call Execution:** Ensured simulated function calls are actionable.
+
+ ## Citation
+
+ ```bibtex
+ @misc{isa02,
+   author  = {NeuraLake},
+   title   = {iSA-02: The First Small Reasoning Model with Context-Dynamic Behavior},
+   year    = {2024},
+   license = {Apache 2.0},
+   url     = {https://huggingface.co/NeuraLake/iSA-02},
+ }
+ ```
+
+ **Note:** This model card is under development and will be updated with additional details, evaluation metrics, and the final model name.
iSA-02-Nano-1B-NoTags.F16.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1cdb9aa63c9ed49f7a38d3a8c21d1379cb091e893239bdaad6a150be3ecbf275
+ size 2479595776
iSA-02-Nano-1B-NoTags.F32.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4a7f386b3f45562d847116612629c07ae644c1bedc3aff1482b89cc25bee4730
+ size 4951089408
iSA-02-Nano-1B-NoTags.Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:26272c3fbe7e61d266c0a2c0dc5b2d1e73f39ab4cbb75260bd66d147dec8ae27
+ size 770928896
iSA-02-Nano-1B-NoTags.Q4_1.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:feeeb4b883b2bbad87a8b0077fb5927e11e3a2f15ea5995010c3dcc2d62d3e79
+ size 831746304
iSA-02-Nano-1B-NoTags.Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:539156d277f4b5985bae8ea5d5d02e89a2949678dc63016e84ff57862ff7e5c4
+ size 807694592
iSA-02-Nano-1B-NoTags.Q5_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a9e7a0b261a4c1450a0de9fc91cce3adf85d1d8fb3600d92cc5d81273ae6e00d
+ size 892563712
iSA-02-Nano-1B-NoTags.Q5_1.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:471633cac0d8aec23dab430c27d77bbd603daa37ff0912d681f16b558a42ff39
+ size 953381120
iSA-02-Nano-1B-NoTags.Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1ca81610883cf3cafc5b606fd3a15dac16b06840a2409f65b7a66a5969b76d10
+ size 911503616
iSA-02-Nano-1B-NoTags.Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f1adb88b4cd6015909880d3b7ede8b3a9d782a767a26b7ffeceedb492cf66293
+ size 1021800704
iSA-02-Nano-1B-NoTags.Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2e1c3583e9d7d600db2962f3aa060c90cf3ca576354b85cb307cdbcd2e6a813b
+ size 1321083136