Update README.md
README.md
CHANGED
@@ -22,12 +22,58 @@ This model is ideal for advanced NLP tasks, including ethical decision-making, m
**Model Overview**

**SpectraMind Models**

A collection of fine-tuned Llama models optimized for CPU performance using the GGUF format. These models are designed for efficient inference with llama.cpp and other lightweight environments.
**Model Directory**

**1. MicroSpectraMind (1B Model)**

- **Base Model**: Fine-tuned from Llama-3.2-1B.
- **Fine-Tuning Details**: Explain the dataset used (e.g., domain-specific text, chat dialogues, or tasks such as summarization or question answering).
- **Optimization**: Quantized to both:
  - `f16`: for maximum accuracy.
  - `q8_0`: for reduced size and faster inference.
- **Use Case**: Ideal for lightweight applications such as embedded systems or single-threaded inference on CPUs.
- **File Sizes**:
  - `MicroSpectraMind_f16.gguf`: 2.4 GB
  - `MicroSpectraMind_q8.gguf`: 1.3 GB
**2. SpectraMind3 (3B Model)**

- **Base Model**: Fine-tuned from Llama-3.2-3B.
- **Fine-Tuning Details**: Include key aspects of the fine-tuning, such as datasets or hyperparameters used, and the tasks it excels at.
- **Optimization**:
  - `f16`: for higher accuracy.
  - `q8_0`: for better efficiency.
- **Use Case**: Balances accuracy and performance; suited for general-purpose natural language tasks.
- **File Sizes**:
  - `SpectraMind3_f16.gguf`: 4.7 GB
  - `SpectraMind3_q8.gguf`: 3.4 GB
**3. SpectraMindZ (8B Model)**

- **Base Model**: Fine-tuned from Llama-3.2-8B.
- **Fine-Tuning Details**: Provide specifics on the dataset and tasks used for fine-tuning.
- **Optimization**:
  - `f16`: for maximum model precision.
  - `q8_0`: for efficient deployment with minimal performance impact.
- **Use Case**: Best for complex tasks requiring deeper reasoning or multitasking.
- **Expected File Sizes**:
  - `SpectraMindZ_f16.gguf`: approximately 12 GB
  - `SpectraMindZ_q8.gguf`: approximately 8 GB
**Optimization and Compatibility**

All models are converted to GGUF format using llama.cpp, optimizing them for CPU-based inference. They are well suited to resource-constrained systems such as desktops, laptops, and embedded devices.

Quantized versions (`q8_0`) are significantly smaller and faster while maintaining reasonable accuracy.
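As a rough back-of-envelope check (an illustrative sketch, not part of the release), a GGUF file's size is approximately the parameter count times the storage cost per weight: `f16` stores 2 bytes per weight, while `q8_0` stores 1 byte plus a small per-block scale. The 1.24B parameter count below is an approximate figure for a Llama-3.2-1B model:

```python
def gguf_size_gb(n_params: float, bytes_per_weight: float) -> float:
    """Rough GGUF file size: parameter count times storage cost per weight."""
    return n_params * bytes_per_weight / 1e9

# Approximate storage cost per weight:
# f16 stores 2 bytes; q8_0 stores 1 byte plus a 2-byte fp16 scale
# shared by each 32-weight block, i.e. about 1 + 2/32 ≈ 1.06 bytes.
F16 = 2.0
Q8_0 = 1.0 + 2.0 / 32

params_1b = 1.24e9  # approximate parameter count of a Llama-3.2-1B model
print(f"f16  ≈ {gguf_size_gb(params_1b, F16):.1f} GB")   # ≈ 2.5 GB
print(f"q8_0 ≈ {gguf_size_gb(params_1b, Q8_0):.1f} GB")  # ≈ 1.3 GB
```

Actual files differ slightly because of tokenizer metadata and mixed-precision tensors, but the estimate explains the roughly 2:1 ratio between the `f16` and `q8_0` sizes listed above.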
**How to Use**

- **Download the GGUF files**: Use the provided links to download the `.gguf` files.
- **Run on llama.cpp**: Example command for inference (newer llama.cpp builds name this binary `llama-cli`):

  ```bash
  ./main -m SpectraMind3_q8.gguf -p "Your prompt here"
  ```
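For scripted use, the same invocation can be assembled from Python. This is a hypothetical sketch (the binary name and flag values are placeholders); `-t` sets CPU threads and `-n` caps the number of generated tokens in llama.cpp:

```python
import subprocess

def build_llama_cmd(binary: str, model: str, prompt: str,
                    threads: int = 4, n_predict: int = 128) -> list[str]:
    """Assemble a llama.cpp CLI invocation:
    -m model file, -p prompt, -t CPU threads, -n max tokens to generate."""
    return [binary, "-m", model, "-p", prompt,
            "-t", str(threads), "-n", str(n_predict)]

cmd = build_llama_cmd("./main", "SpectraMind3_q8.gguf", "Your prompt here")
print(" ".join(cmd))
# subprocess.run(cmd, check=True)  # uncomment to actually run inference
```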
- **Choose quantization based on use case**:
  - Use `f16` for maximum accuracy (e.g., research or high-precision tasks).
  - Use `q8_0` for faster inference (e.g., real-time applications).
**Model Comparison**

| Model            | Parameters | f16 Size | q8_0 Size | Use Case                          |
|------------------|------------|----------|-----------|-----------------------------------|
| MicroSpectraMind | 1B         | 2.4 GB   | 1.3 GB    | Lightweight, quick responses      |
| SpectraMind3     | 3B         | 4.7 GB   | 3.4 GB    | Balanced accuracy/performance     |
| SpectraMindZ     | 8B         | 12 GB    | 8 GB      | Advanced tasks, complex reasoning |
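One practical way to read the comparison table: pick the largest quantized model whose file fits in available RAM, leaving headroom for the KV cache and the OS. A minimal sketch (the helper function and the 1 GB headroom figure are illustrative assumptions; the sizes come from the table):

```python
# (file name, q8_0 size in GB) from the comparison table
MODELS = [
    ("MicroSpectraMind_q8.gguf", 1.3),
    ("SpectraMind3_q8.gguf", 3.4),
    ("SpectraMindZ_q8.gguf", 8.0),
]

def pick_model(ram_gb: float, headroom_gb: float = 1.0):
    """Return the largest model file that fits in ram_gb minus headroom,
    or None if even the smallest model does not fit."""
    budget = ram_gb - headroom_gb
    fitting = [m for m in MODELS if m[1] <= budget]
    return max(fitting, key=lambda m: m[1])[0] if fitting else None

print(pick_model(8.0))   # SpectraMind3_q8.gguf
print(pick_model(16.0))  # SpectraMindZ_q8.gguf
```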
**Usage**: Run on any web interface or as a bot for self-hosted solutions. Designed to run smoothly on CPU.