davideuler committed on
Commit 805fa9c · 1 Parent(s): 9482433

for Huggingface push

Files changed (3)
  1. README.md +98 -1
  2. main.py +211 -21
  3. requirements.txt +6 -0
README.md CHANGED
@@ -10,5 +10,102 @@ pinned: false
  license: mit
  short_description: Some small models chatbot
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ =======
+ # Multi-Model Tiny Chatbot
+
+ A lightweight, multi-model chat application featuring several small language models optimized for different tasks. Built with Gradio for an intuitive web interface and designed for local deployment.
+
+ ## 🌟 Features
+
+ - **Multiple Model Support**: Choose from 4 specialized small language models
+ - **Lazy Loading**: Models are loaded only when selected, optimizing memory usage
+ - **Real-time Chat Interface**: Smooth conversational experience with Gradio
+ - **Lightweight**: All models are under 200M parameters for fast inference
+ - **Local Deployment**: Run entirely on your local machine
+
+ ## 🤖 Available Models
+
+ ### 1. SmolLM2 (135M Parameters)
+ - **Purpose**: General conversation and instruction following
+ - **Architecture**: HuggingFace SmolLM2-135M-Instruct
+ - **Best For**: General Q&A, creative writing, coding help
+ - **Language**: English
+
+ ### 2. NanoLM-25M (25M Parameters)
+ - **Purpose**: Ultra-lightweight instruction following
+ - **Architecture**: Mistral-based with chat template support
+ - **Best For**: Quick responses, simple tasks, resource-constrained environments
+ - **Language**: English
+
+ ### 3. NanoTranslator-S (9M Parameters)
+ - **Purpose**: English to Chinese translation
+ - **Architecture**: LLaMA-based translation model
+ - **Best For**: Translating English text to Chinese
+ - **Language**: English → Chinese
+
+ ### 4. NanoTranslator-XL (78M Parameters)
+ - **Purpose**: Enhanced English to Chinese translation
+ - **Architecture**: LLaMA-based with improved accuracy
+ - **Best For**: High-quality English to Chinese translation
+ - **Language**: English → Chinese
+
+ ## 🚀 Quick Start
+
+ ### Prerequisites
+
+ - Python 3.8 or higher
+ - 4GB+ RAM recommended
+ - Internet connection for initial model downloads
+
+ ### Installation
+
+ 1. **Run the application**
+ ```bash
+ uv run main.py
+ ```
+
+ 2. **Open your browser**
+    - Navigate to `http://localhost:7860`
+    - Select a model and start chatting!
+
+ ## 🎯 Use Cases
+
+ ### General Conversation
+ - Use **SmolLM2** or **NanoLM-25M** for general chat, Q&A, and assistance
+
+ ### Translation Tasks
+ - Use **NanoTranslator-S** for quick English → Chinese translations
+ - Use **NanoTranslator-XL** for higher-quality English → Chinese translations
+
+ ### Resource-Constrained Environments
+ - **NanoLM-25M** (25M params) for ultra-lightweight deployment
+ - **NanoTranslator-S** (9M params) for minimal translation needs
+
+ ## 💡 Model Performance
+
+ | Model | Parameters | Use Case | Memory Usage | Speed |
+ |-------|------------|----------|--------------|-------|
+ | SmolLM2 | 135M | General Chat | ~500MB | Fast |
+ | NanoLM-25M | 25M | Lightweight Chat | ~100MB | Very Fast |
+ | NanoTranslator-S | 9M | Quick Translation | ~50MB | Very Fast |
+ | NanoTranslator-XL | 78M | Quality Translation | ~300MB | Fast |
+
+ ### Model Sources
+ - SmolLM2: `HuggingFaceTB/SmolLM2-135M-Instruct`
+ - NanoLM-25M: `Mxode/NanoLM-25M-Instruct-v1.1`
+ - NanoTranslator-S: `Mxode/NanoTranslator-S`
+ - NanoTranslator-XL: `Mxode/NanoTranslator-XL`
+
+ ## 📝 License
+
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+ ## 🙏 Acknowledgments
+
+ - [HuggingFace](https://huggingface.co/) for the Transformers library and model hosting
+ - [Mxode](https://huggingface.co/Mxode) for the Nano series models
+ - [Gradio](https://gradio.app/) for the amazing web interface framework
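The "Lazy Loading" feature above amounts to a cache-on-first-use dictionary keyed by model name. A minimal sketch with stand-in loader callables — the real app would call `AutoTokenizer.from_pretrained` / `AutoModelForCausalLM.from_pretrained` inside the loaders instead of these stubs:

```python
# Minimal lazy-loading cache mirroring the pattern described above.
# The loader lambdas are stand-ins; real loaders would download and
# construct HuggingFace tokenizer/model objects on first use.

class LazyModelStore:
    def __init__(self, loaders):
        self._loaders = loaders   # name -> zero-arg callable that loads the model
        self._cache = {}          # name -> loaded object, filled on first request
        self.load_count = 0       # number of real loads performed

    def get(self, name):
        if name not in self._cache:          # load only when first selected
            self._cache[name] = self._loaders[name]()
            self.load_count += 1
        return self._cache[name]

store = LazyModelStore({
    "SmolLM2": lambda: "smol-model-object",
    "NanoLM-25M": lambda: "nano-model-object",
})

store.get("NanoLM-25M")   # first use: triggers the load
store.get("NanoLM-25M")   # second use: served from cache, no reload
```

This keeps startup memory at zero models and peak memory at only the models actually used in a session.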
 
main.py CHANGED
@@ -1,5 +1,5 @@
  import gradio as gr
- from transformers import AutoModelForCausalLM, AutoTokenizer, T5ForConditionalGeneration, T5Tokenizer
 
  class MultiModelChat:
      def __init__(self):
@@ -15,10 +15,20 @@ class MultiModelChat:
                  'tokenizer': AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct"),
                  'model': AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")
              }
-         elif model_name == 'FLAN-T5':
-             self.models['FLAN-T5'] = {
-                 'tokenizer': T5Tokenizer.from_pretrained("google/flan-t5-small"),
-                 'model': T5ForConditionalGeneration.from_pretrained("google/flan-t5-small")
              }
 
          # Set pad token for the newly loaded model
@@ -30,8 +40,12 @@ class MultiModelChat:
      def chat(self, message, history, model_choice):
          if model_choice == "SmolLM2":
              return self.chat_smol(message, history)
-         elif model_choice == "FLAN-T5":
-             return self.chat_flan(message, history)
 
      def chat_smol(self, message, history):
          self.ensure_model_loaded('SmolLM2')
@@ -50,15 +64,79 @@ class MultiModelChat:
          response = tokenizer.decode(outputs[0], skip_special_tokens=True)
          return response.split("Assistant:")[-1].strip()
 
-     def chat_flan(self, message, history):
-         self.ensure_model_loaded('FLAN-T5')
-
-         tokenizer = self.models['FLAN-T5']['tokenizer']
-         model = self.models['FLAN-T5']['model']
-
-         inputs = tokenizer(f"Answer the question: {message}", return_tensors="pt")
-         outputs = model.generate(inputs.input_ids, max_length=100)
-         return tokenizer.decode(outputs[0], skip_special_tokens=True)
 
  chat_app = MultiModelChat()
@@ -66,18 +144,126 @@ def respond(message, history, model_choice):
      return chat_app.chat(message, history, model_choice)
 
  with gr.Blocks(theme="soft") as demo:
-     gr.Markdown("# Multi-Model Tiny Chatbot")
 
      with gr.Row():
          model_dropdown = gr.Dropdown(
-             choices=["SmolLM2", "FLAN-T5"],
-             value="SmolLM2",
-             label="Select Model"
          )
 
-     chatbot = gr.Chatbot(height=400)
-     msg = gr.Textbox(label="Message", placeholder="Type your message here...")
-     clear = gr.Button("Clear")
 
      def user_message(message, history):
          return "", history + [[message, None]]
@@ -88,9 +274,13 @@ with gr.Blocks(theme="soft") as demo:
          history[-1][1] = bot_response
          return history
 
      msg.submit(user_message, [msg, chatbot], [msg, chatbot]).then(
          bot_message, [chatbot, model_dropdown], chatbot
      )
      clear.click(lambda: None, None, chatbot, queue=False)
 
  demo.launch()
 
  import gradio as gr
+ from transformers import AutoModelForCausalLM, AutoTokenizer
 
  class MultiModelChat:
      def __init__(self):
 
                  'tokenizer': AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct"),
                  'model': AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")
              }
+         elif model_name == 'NanoLM-25M':
+             self.models['NanoLM-25M'] = {
+                 'tokenizer': AutoTokenizer.from_pretrained("Mxode/NanoLM-25M-Instruct-v1.1"),
+                 'model': AutoModelForCausalLM.from_pretrained("Mxode/NanoLM-25M-Instruct-v1.1")
+             }
+         elif model_name == 'NanoTranslator-S':
+             self.models['NanoTranslator-S'] = {
+                 'tokenizer': AutoTokenizer.from_pretrained("Mxode/NanoTranslator-S"),
+                 'model': AutoModelForCausalLM.from_pretrained("Mxode/NanoTranslator-S")
+             }
+         elif model_name == 'NanoTranslator-XL':
+             self.models['NanoTranslator-XL'] = {
+                 'tokenizer': AutoTokenizer.from_pretrained("Mxode/NanoTranslator-XL"),
+                 'model': AutoModelForCausalLM.from_pretrained("Mxode/NanoTranslator-XL")
              }
 
          # Set pad token for the newly loaded model
40
  def chat(self, message, history, model_choice):
41
  if model_choice == "SmolLM2":
42
  return self.chat_smol(message, history)
43
+ elif model_choice == "NanoLM-25M":
44
+ return self.chat_nanolm(message, history)
45
+ elif model_choice == "NanoTranslator-S":
46
+ return self.chat_translator(message, history)
47
+ elif model_choice == "NanoTranslator-XL":
48
+ return self.chat_translator_xl(message, history)
49
 
50
  def chat_smol(self, message, history):
51
  self.ensure_model_loaded('SmolLM2')
 
          response = tokenizer.decode(outputs[0], skip_special_tokens=True)
          return response.split("Assistant:")[-1].strip()
 
+     def chat_nanolm(self, message, history):
+         self.ensure_model_loaded('NanoLM-25M')
+
+         tokenizer = self.models['NanoLM-25M']['tokenizer']
+         model = self.models['NanoLM-25M']['model']
+
+         # Use chat template for NanoLM
+         messages = [
+             {"role": "system", "content": "You are a helpful assistant."},
+             {"role": "user", "content": message}
+         ]
+         text = tokenizer.apply_chat_template(
+             messages,
+             tokenize=False,
+             add_generation_prompt=True
+         )
+         inputs = tokenizer([text], return_tensors="pt")
+         outputs = model.generate(
+             inputs.input_ids,
+             max_new_tokens=100,
+             temperature=0.7,
+             do_sample=True,
+             pad_token_id=tokenizer.eos_token_id
+         )
+         generated_ids = [
+             output_ids[len(input_ids):] for input_ids, output_ids in zip(inputs.input_ids, outputs)
+         ]
+         response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+         return response
+
+     def chat_translator(self, message, history):
+         self.ensure_model_loaded('NanoTranslator-S')
 
+         tokenizer = self.models['NanoTranslator-S']['tokenizer']
+         model = self.models['NanoTranslator-S']['model']
 
+         # Use translation prompt format
+         prompt = f"<|im_start|>{message}<|endoftext|>"
+         inputs = tokenizer([prompt], return_tensors="pt")
+         outputs = model.generate(
+             inputs.input_ids,
+             max_new_tokens=100,
+             temperature=0.55,
+             do_sample=True,
+             pad_token_id=tokenizer.eos_token_id
+         )
+         generated_ids = [
+             output_ids[len(input_ids):] for input_ids, output_ids in zip(inputs.input_ids, outputs)
+         ]
+         response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+         return response
+
+     def chat_translator_xl(self, message, history):
+         self.ensure_model_loaded('NanoTranslator-XL')
+
+         tokenizer = self.models['NanoTranslator-XL']['tokenizer']
+         model = self.models['NanoTranslator-XL']['model']
+
+         # Use translation prompt format
+         prompt = f"<|im_start|>{message}<|endoftext|>"
+         inputs = tokenizer([prompt], return_tensors="pt")
+         outputs = model.generate(
+             inputs.input_ids,
+             max_new_tokens=100,
+             temperature=0.55,
+             do_sample=True,
+             pad_token_id=tokenizer.eos_token_id
+         )
+         generated_ids = [
+             output_ids[len(input_ids):] for input_ids, output_ids in zip(inputs.input_ids, outputs)
+         ]
+         response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+         return response
 
  chat_app = MultiModelChat()
 
      return chat_app.chat(message, history, model_choice)
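All three new chat methods strip the echoed prompt from `model.generate` output the same way: each output sequence is sliced past the length of its input sequence, so only the newly generated tokens are decoded. The same idea with plain lists, no model required:

```python
# model.generate returns the prompt tokens followed by the new tokens;
# slicing each output past len(input_ids) keeps only the generated part.

input_ids_batch = [[101, 7592, 102]]                 # stand-in prompt token ids
output_ids_batch = [[101, 7592, 102, 2023, 2003]]    # prompt ids + generated ids

generated = [
    out[len(inp):] for inp, out in zip(input_ids_batch, output_ids_batch)
]
# generated == [[2023, 2003]]
```

Without this slice, `batch_decode` would return the user's prompt prepended to every reply.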
 
  with gr.Blocks(theme="soft") as demo:
+     gr.Markdown("# 🤖 Multi-Model Tiny Chatbot")
+     gr.Markdown("*Lightweight AI models for different tasks - Choose the right model for your needs!*")
 
      with gr.Row():
          model_dropdown = gr.Dropdown(
+             choices=["SmolLM2", "NanoLM-25M", "NanoTranslator-S", "NanoTranslator-XL"],
+             value="NanoLM-25M",
+             label="Select Model",
+             info="Choose the best model for your task"
+         )
+
+     # Model information display
+     with gr.Row():
+         model_info = gr.Markdown(
+             """
+             ## 📋 NanoLM-25M (25M) - Selected
+             **Best for:** Quick responses, simple tasks, resource-constrained environments
+             **Language:** English
+             **Memory:** ~100MB
+             **Speed:** Very Fast
+
+             💡 **Tip:** Ultra-lightweight model perfect for fast responses!
+             """,
+             visible=True
          )
 
+     chatbot = gr.Chatbot(height=400, show_label=False)
+     msg = gr.Textbox(
+         label="Message",
+         placeholder="Type your message here...",
+         lines=2
+     )
+
+     with gr.Row():
+         clear = gr.Button("🗑️ Clear Chat", variant="secondary")
+         submit = gr.Button("💬 Send", variant="primary")
+
+     # Usage tips
+     with gr.Accordion("📖 Model Usage Guide", open=False):
+         gr.Markdown("""
+         ### 🎯 When to use each model:
+
+         **🔵 SmolLM2 (135M)**
+         - General conversations and questions
+         - Creative writing tasks
+         - Coding help and explanations
+         - Educational content
+
+         **🟢 NanoLM-25M (25M)**
+         - Quick responses when speed matters
+         - Resource-constrained environments
+         - Simple Q&A tasks
+         - Mobile or edge deployment
+
+         **🔴 NanoTranslator-S (9M)**
+         - Fast English → Chinese translation
+         - Basic translation needs
+         - Ultra-low memory usage
+         - Real-time translation
+
+         **🟡 NanoTranslator-XL (78M)**
+         - High-quality English → Chinese translation
+         - Professional translation work
+         - Complex sentences and idioms
+         - Better context understanding
+
+         ### 💡 Pro Tips:
+         - Models load automatically when first selected (lazy loading)
+         - Translation models work best with clear, complete sentences
+         - For translation, input English text and get Chinese output
+         - Restart the app to free up memory from unused models
+         """)
+
+     def update_model_info(model_choice):
+         info_map = {
+             "SmolLM2": """
+             ## 📋 SmolLM2 (135M) - Selected
+             **Best for:** General conversation, Q&A, creative writing, coding help
+             **Language:** English
+             **Memory:** ~500MB
+             **Speed:** Fast
+
+             💡 **Tip:** Great all-around model for most conversational tasks!
+             """,
+             "NanoLM-25M": """
+             ## 📋 NanoLM-25M (25M) - Selected
+             **Best for:** Quick responses, simple tasks, resource-constrained environments
+             **Language:** English
+             **Memory:** ~100MB
+             **Speed:** Very Fast
+
+             💡 **Tip:** Ultra-lightweight model perfect for fast responses!
+             """,
+             "NanoTranslator-S": """
+             ## 📋 NanoTranslator-S (9M) - Selected
+             **Best for:** Fast English → Chinese translation
+             **Language:** English → Chinese
+             **Memory:** ~50MB
+             **Speed:** Very Fast
+
+             💡 **Tip:** Input English text to get Chinese translation. Great for quick translations!
+             """,
+             "NanoTranslator-XL": """
+             ## 📋 NanoTranslator-XL (78M) - Selected
+             **Best for:** High-quality English → Chinese translation
+             **Language:** English → Chinese
+             **Memory:** ~300MB
+             **Speed:** Fast
+
+             💡 **Tip:** Best translation quality for complex sentences and professional use!
+             """
+         }
+         return info_map.get(model_choice, "")
+
+     # Update model info when dropdown changes
+     model_dropdown.change(
+         update_model_info,
+         inputs=[model_dropdown],
+         outputs=[model_info]
+     )
 
      def user_message(message, history):
          return "", history + [[message, None]]
 
          history[-1][1] = bot_response
          return history
 
+     # Handle message submission
      msg.submit(user_message, [msg, chatbot], [msg, chatbot]).then(
          bot_message, [chatbot, model_dropdown], chatbot
      )
+     submit.click(user_message, [msg, chatbot], [msg, chatbot]).then(
+         bot_message, [chatbot, model_dropdown], chatbot
+     )
      clear.click(lambda: None, None, chatbot, queue=False)
 
  demo.launch()
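The `msg.submit(...).then(...)` chain above (and the matching `submit.click` chain) runs in two steps: `user_message` appends the user turn to the history with a `None` placeholder and clears the textbox, then `bot_message` fills the placeholder with the model's reply. Stripped of Gradio, the history bookkeeping is plain list manipulation:

```python
# Two-step submit flow from the UI, reusing the same function bodies.

def user_message(message, history):
    # Clear the textbox ("") and append the user turn with a None placeholder.
    return "", history + [[message, None]]

def bot_message(history, bot_response):
    # Fill the placeholder in the most recent turn with the bot's reply.
    history[-1][1] = bot_response
    return history

cleared, history = user_message("Hi", [])
history = bot_message(history, "Hello!")
# cleared == "" and history == [["Hi", "Hello!"]]
```

Splitting the flow this way lets the UI render the user's message immediately, before the (slower) model call completes.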
requirements.txt ADDED
@@ -0,0 +1,6 @@
+ gradio>=4.0.0
+ transformers>=4.30.0
+ torch>=2.0.0
+ protobuf>=4.21.0
+ accelerate>=0.20.0
+ safetensors>=0.3.0