marialasam committed
Commit 217f444 · verified · 1 Parent(s): 4e67133

Update README.md

Files changed (1): README.md (+109 -2)
README.md CHANGED
@@ -1,3 +1,110 @@
 
 
 
 
- license: llama3
- ---

# LogiLlama

**LogiLlama** is a fine-tuned language model developed by Goppa AI. Built upon a 1B-parameter base from LLaMA, LogiLlama has been enhanced with injected knowledge and logical reasoning abilities. Our mission is to make smaller models smarter: delivering improved reasoning and problem-solving capabilities while maintaining a low memory footprint and energy efficiency for on-device applications.

---

## Model Summary

While recent trends in language models have leaned towards scaling up parameters, LogiLlama demonstrates that “less can be more.” By fine-tuning a 1B-parameter base model with advanced logical reasoning techniques, LogiLlama offers:

- **Enhanced Reasoning:** Improved logical thinking and knowledge integration for more accurate and context-aware responses.
- **Efficiency:** Designed for on-device processing with a low memory and energy footprint.
- **Transparency:** Our training process and configuration files are fully open-source, reflecting our commitment to transparent and reproducible research.

LogiLlama is the first step in our journey at Goppa AI to develop efficient, intelligent, and resource-friendly models that challenge the notion that bigger is always better.

---

## Model Description

- **Model Type:** Small Language Model (SLM) fine-tuned from a 1B-parameter LLaMA base
- **Architecture:** (see the config snippet below)
  - Hidden Size: 2048
  - Hidden Layers: 16
  - Attention Heads: 32
  - Intermediate Size: 8192
  - Special Configuration: Incorporates a customized RoPE scaling (`rope_type: "llama3"`)
- **Tokenization:**
  - Custom tokenizer with an extensive set of special tokens (defined in `special_tokens_map.json` and `tokenizer_config.json`)
- **Language:** English
- **License:** MIT
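
The architecture values above come straight from the repository's `config.json`, so they can be checked programmatically. Below is a minimal sketch using the standard `transformers` config API; the attribute names are the usual LLaMA config fields, and the actual values depend on the published files:

```python
from transformers import AutoConfig

# Inspect the architecture settings shipped in config.json
config = AutoConfig.from_pretrained("GoppaAI/LogiLlama", trust_remote_code=True)

print(config.hidden_size)          # expected: 2048
print(config.num_hidden_layers)    # expected: 16
print(config.num_attention_heads)  # expected: 32
print(config.intermediate_size)    # expected: 8192
print(config.rope_scaling)         # should report rope_type "llama3" per the card
```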

---

## How to Use

Below is a sample code snippet demonstrating how to use LogiLlama with the Hugging Face Transformers library:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load tokenizer and model from our repository
tokenizer = AutoTokenizer.from_pretrained("GoppaAI/LogiLlama", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("GoppaAI/LogiLlama", trust_remote_code=True)

# Use the GPU when available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

text = "When faced with a complex problem, one must first analyze "
input_ids = tokenizer(text, return_tensors="pt").to(device).input_ids

outputs = model.generate(
    input_ids,
    max_length=1000,
    do_sample=True,              # required for temperature/top_p to take effect
    temperature=0.6,
    top_p=0.9,
    repetition_penalty=1.2,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, dropping special tokens such as EOS
print(tokenizer.batch_decode(outputs[:, input_ids.shape[1]:], skip_special_tokens=True)[0].strip())
```
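
If you prefer a higher-level interface, the same sampling settings can also be passed through the `pipeline` API. The following is an equivalent sketch rather than an officially documented path for this repository:

```python
from transformers import pipeline

# High-level alternative to the manual generate() call above
generator = pipeline(
    "text-generation",
    model="GoppaAI/LogiLlama",
    trust_remote_code=True,
    device=0,  # set to -1 (or omit) to run on CPU
)

result = generator(
    "When faced with a complex problem, one must first analyze ",
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    repetition_penalty=1.2,
)
print(result[0]["generated_text"])
```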

---

## Configuration Files

The model repository includes several key configuration files that ensure LogiLlama functions seamlessly within the Hugging Face ecosystem (the snippet after this list shows how to inspect them):

- **config.json:** Contains the model architecture settings, including hidden size, number of layers, attention heads, and other hyperparameters.
- **generation_config.json:** Defines generation parameters such as temperature, top-p sampling, and end-of-sequence tokens.
- **special_tokens_map.json:** Maps special tokens (e.g., beginning-of-text, end-of-text, padding) used during tokenization.
- **tokenizer_config.json:** Provides metadata and settings for the tokenizer, ensuring consistency with the model’s vocabulary and special tokens.
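
These files are read automatically by `from_pretrained`, but they can also be inspected directly, for example to see which defaults `generate()` will pick up. A minimal sketch (the printed values depend on the actual repository contents):

```python
from transformers import AutoTokenizer, GenerationConfig

# generation_config.json holds the default sampling settings used by generate()
gen_config = GenerationConfig.from_pretrained("GoppaAI/LogiLlama")
print(gen_config)

# tokenizer_config.json and special_tokens_map.json drive the tokenizer setup
tokenizer = AutoTokenizer.from_pretrained("GoppaAI/LogiLlama", trust_remote_code=True)
print(tokenizer.special_tokens_map)
```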

---

## Training Details

LogiLlama was fine-tuned by injecting logical reasoning and domain-specific knowledge into a 1B-parameter LLaMA base. By carefully curating training data and employing specialized techniques, we enhanced the model’s capability to handle reasoning tasks without significantly increasing its size. This project marks our commitment to advancing small, efficient models that do not compromise on performance.

---

## Citation

If you use LogiLlama in your research, please cite:

```bibtex
@misc{goppa2025logillama,
  title={LogiLlama: Injecting Logical Reasoning into Small Language Models},
  author={Goppa AI},
  year={2025},
  note={https://github.com/GoppaAI/LogiLlama}
}
```

---

## License

LogiLlama is released under the [MIT License](https://opensource.org/licenses/MIT).

---

## Inference & Deployment

- **Model Size:** 1B parameters
- **Tensor Type:** float32 (F32)
- **Deployment:** Optimized for on-device inference and resource-constrained environments. Currently available for local deployment; stay tuned for updates on hosted inference solutions. A reduced-precision loading sketch follows this list.
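
Because the checkpoint is stored in float32, memory-constrained deployments may want to load it in reduced precision. This is an optional sketch rather than a recommendation from the card, and casting to half precision can affect output quality:

```python
import torch
from transformers import AutoModelForCausalLM

# The published weights are float32; loading in half precision roughly halves
# memory use, which can matter on constrained devices. Validate the quality
# trade-off for your workload before relying on it.
model = AutoModelForCausalLM.from_pretrained(
    "GoppaAI/LogiLlama",
    torch_dtype=torch.float16,  # or torch.bfloat16 where supported
    trust_remote_code=True,
)
print(model.dtype)
```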

---

At Goppa AI, we are committed to pushing the boundaries of efficiency and intelligence in language models. LogiLlama is our first step towards creating small models that are not only resource-friendly but also smart enough to handle complex reasoning tasks. We invite you to explore, use, and contribute to this project on our [GitHub repository](https://github.com/GoppaAI/LogiLlama).

Happy innovating!