afrideva committed
Commit
276fb4d
•
1 Parent(s): 165563b

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +269 -0
README.md ADDED

---
base_model: BEE-spoke-data/smol_llama-101M-GQA-python
datasets:
- BEE-spoke-data/pypi_clean-deduped
inference: false
language:
- en
license: apache-2.0
metrics:
- accuracy
model_creator: BEE-spoke-data
model_name: smol_llama-101M-GQA-python
pipeline_tag: text-generation
quantized_by: afrideva
source_model: BEE-spoke-data/smol_llama-101M-GQA
tags:
- python
- codegen
- markdown
- smol_llama
- gguf
- ggml
- quantized
- q2_k
- q3_k_m
- q4_k_m
- q5_k_m
- q6_k
- q8_0
widget:
- example_title: Add Numbers Function
  text: "def add_numbers(a, b):\n    return\n"
- example_title: Car Class
  text: "class Car:\n    def __init__(self, make, model):\n        self.make = make\n        self.model = model\n\n    def display_car(self):\n"
- example_title: Pandas DataFrame
  text: 'import pandas as pd

    data = {''Name'': [''Tom'', ''Nick'', ''John''], ''Age'': [20, 21, 19]}

    df = pd.DataFrame(data).convert_dtypes()

    # eda

    '
- example_title: Factorial Function
  text: "def factorial(n):\n    if n == 0:\n        return 1\n    else:\n"
- example_title: Fibonacci Function
  text: "def fibonacci(n):\n    if n <= 0:\n        raise ValueError(\"Incorrect input\")\n    elif n == 1:\n        return 0\n    elif n == 2:\n        return 1\n    else:\n"
- example_title: Matplotlib Plot
  text: 'import matplotlib.pyplot as plt

    import numpy as np

    x = np.linspace(0, 10, 100)

    # simple plot

    '
- example_title: Reverse String Function
  text: "def reverse_string(s:str) -> str:\n    return\n"
- example_title: Palindrome Function
  text: "def is_palindrome(word:str) -> bool:\n    return\n"
- example_title: Bubble Sort Function
  text: "def bubble_sort(lst: list):\n    n = len(lst)\n    for i in range(n):\n        for j in range(0, n-i-1):\n"
- example_title: Binary Search Function
  text: "def binary_search(arr, low, high, x):\n    if high >= low:\n        mid = (high + low) // 2\n        if arr[mid] == x:\n            return mid\n        elif arr[mid] > x:\n"
---

# BEE-spoke-data/smol_llama-101M-GQA-python-GGUF

Quantized GGUF model files for [smol_llama-101M-GQA-python](https://huggingface.co/BEE-spoke-data/smol_llama-101M-GQA-python) from [BEE-spoke-data](https://huggingface.co/BEE-spoke-data).

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [smol_llama-101m-gqa-python.fp16.gguf](https://huggingface.co/afrideva/smol_llama-101M-GQA-python-GGUF/resolve/main/smol_llama-101m-gqa-python.fp16.gguf) | fp16 | None |
| [smol_llama-101m-gqa-python.q2_k.gguf](https://huggingface.co/afrideva/smol_llama-101M-GQA-python-GGUF/resolve/main/smol_llama-101m-gqa-python.q2_k.gguf) | q2_k | None |
| [smol_llama-101m-gqa-python.q3_k_m.gguf](https://huggingface.co/afrideva/smol_llama-101M-GQA-python-GGUF/resolve/main/smol_llama-101m-gqa-python.q3_k_m.gguf) | q3_k_m | None |
| [smol_llama-101m-gqa-python.q4_k_m.gguf](https://huggingface.co/afrideva/smol_llama-101M-GQA-python-GGUF/resolve/main/smol_llama-101m-gqa-python.q4_k_m.gguf) | q4_k_m | None |
| [smol_llama-101m-gqa-python.q5_k_m.gguf](https://huggingface.co/afrideva/smol_llama-101M-GQA-python-GGUF/resolve/main/smol_llama-101m-gqa-python.q5_k_m.gguf) | q5_k_m | None |
| [smol_llama-101m-gqa-python.q6_k.gguf](https://huggingface.co/afrideva/smol_llama-101M-GQA-python-GGUF/resolve/main/smol_llama-101m-gqa-python.q6_k.gguf) | q6_k | None |
| [smol_llama-101m-gqa-python.q8_0.gguf](https://huggingface.co/afrideva/smol_llama-101M-GQA-python-GGUF/resolve/main/smol_llama-101m-gqa-python.q8_0.gguf) | q8_0 | None |
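
To try one of these files locally, one option is to fetch it with `huggingface_hub` and load it with `llama-cpp-python`; below is a minimal sketch (the quant choice, prompt, and token budget are illustrative, not a recommendation):

```python
# Minimal sketch: download one quantized file and run it with llama-cpp-python.
# Assumes `pip install huggingface_hub llama-cpp-python`; file name taken from the table above.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="afrideva/smol_llama-101M-GQA-python-GGUF",
    filename="smol_llama-101m-gqa-python.q5_k_m.gguf",
)
llm = Llama(model_path=model_path)
out = llm("def add_numbers(a, b):\n    return", max_tokens=64)
print(out["choices"][0]["text"])
```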

## Original Model Card:
# smol_llama-101M-GQA: python

> 400MB of buzz: pure Python programming nectar! 🍯

This model is the general pre-trained checkpoint `BEE-spoke-data/smol_llama-101M-GQA` trained on a deduped version of `pypi` for one additional epoch. Play with the model in [this demo space](https://huggingface.co/spaces/BEE-spoke-data/beecoder-playground).

- Its architecture is the same as the base model, with some new Python-related tokens added to the vocab prior to training (see the sketch below).
- It can generate basic Python code and README-style markdown, but will struggle with harder planning/reasoning tasks.
- This is an experiment to test the abilities of smol-sized models in code generation: **both** their capabilities and limitations.
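
The vocab change above is easy to check yourself; here is a small sketch that diffs the two tokenizers (both repo IDs appear in this card):

```python
# Sketch: compare the fine-tuned tokenizer's vocab against the base checkpoint
# to see which tokens were added. Both model IDs are taken from this card.
from transformers import AutoTokenizer

base = AutoTokenizer.from_pretrained("BEE-spoke-data/smol_llama-101M-GQA", use_fast=False)
tuned = AutoTokenizer.from_pretrained("BEE-spoke-data/smol_llama-101M-GQA-python", use_fast=False)

added = set(tuned.get_vocab()) - set(base.get_vocab())
print(f"{len(added)} added tokens, e.g. {sorted(added)[:10]}")
```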

Use with care & understand that there may be some bugs 🐛 still to be worked out.

## Usage

📌 Be sure to note:

1. The model uses the "slow" llama2 tokenizer, so set `use_fast=False` when loading the tokenizer.
2. Use `transformers` version 4.33.3 due to a known issue in version 4.34.1 (_at time of writing_).

> Which llama2 tokenizer the API widget uses is an age-old mystery, and may cause minor whitespace issues (widget only).

To install the necessary packages and load the model:

```python
# Install the necessary packages first:
# pip install transformers==4.33.3 accelerate sentencepiece

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer (slow tokenizer, per the note above) and the model
tokenizer = AutoTokenizer.from_pretrained(
    "BEE-spoke-data/smol_llama-101M-GQA-python",
    use_fast=False,
)
model = AutoModelForCausalLM.from_pretrained(
    "BEE-spoke-data/smol_llama-101M-GQA-python",
    device_map="auto",
)

# The model can now be used as any other decoder
```
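
Once loaded, generation is the standard `transformers` causal-LM flow; a quick sanity check (the prompt and token budget here are illustrative):

```python
# Quick sanity check: complete a short Python prompt (values are illustrative)
prompt = "def add_numbers(a, b):\n    return"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```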

### longer code-gen example

Below is a quick script that can be used as a reference/starting point for writing your own, better one :)

<details>
<summary>🔥 Unleash the Power of Code Generation! Click to Reveal the Magic! 🔮</summary>

Are you ready to witness the incredible possibilities of code generation? 🚀 Brace yourself for an exceptional journey into the world of artificial intelligence and programming. Observe a script that will change the way you create and finalize code.

This script is your gateway to a world where machines can write code with remarkable precision and imagination.

```python
"""
Simple script for testing model(s) designed to generate/complete code.

See details/args with:
    python textgen_inference_code.py --help
Example:
    python textgen_inference_code.py --prompt "def hello_world():" --n_tokens 128
"""
import logging
import random
import time

import fire
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

logging.basicConfig(format="%(levelname)s - %(message)s", level=logging.INFO)


class Timer:
    """
    Basic context-manager timer utility.
    """

    def __enter__(self):
        self.start_time = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.end_time = time.perf_counter()
        self.elapsed_time = self.end_time - self.start_time
        logging.info(f"Elapsed time: {self.elapsed_time:.4f} seconds")


def load_model(model_name, use_fast=False):
    """Utility for loading the model and tokenizer."""
    logging.info(f"Loading model: {model_name}")
    tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=use_fast)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype="auto", device_map="auto"
    )
    model = torch.compile(model)
    return tokenizer, model


def run_inference(prompt, model, tokenizer, max_new_tokens: int = 256):
    """
    Generate a completion for a prompt, timing the run.

    Args:
        prompt (str): the text to complete
        model: the loaded causal LM
        tokenizer: the matching tokenizer
        max_new_tokens (int, optional): max new tokens to generate

    Returns:
        str: the decoded completion (prompt included)
    """
    logging.info(f"Running inference with max_new_tokens={max_new_tokens} ...")
    with Timer() as timer:
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            min_new_tokens=8,
            renormalize_logits=True,
            no_repeat_ngram_size=8,
            repetition_penalty=1.04,
            num_beams=4,
            early_stopping=True,
        )
        text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
    logging.info(f"Output text:\n\n{text}")
    return text


def main(
    model_name="BEE-spoke-data/smol_llama-101M-GQA-python",
    prompt: str = None,
    use_fast=False,
    n_tokens: int = 256,
):
    """
    Load a model and generate a completion for a prompt.

    Args:
        model_name (str, optional): the model repo ID to load
        prompt (str, optional): specify the prompt directly (default: random choice from list)
        use_fast (bool, optional): use the fast tokenizer (default: False, per the notes above)
        n_tokens (int, optional): max new tokens to generate
    """
    logging.info(f"Inference with:\t{model_name}, max_new_tokens:{n_tokens}")

    if prompt is None:
        prompt_list = [
            '''
def print_primes(n: int):
    """
    Print all primes between 1 and n
    """''',
            "def quantum_analysis(",
            "def sanitize_filenames(target_dir:str, recursive:False, extension",
        ]
        prompt = random.SystemRandom().choice(prompt_list)

    logging.info(f"Using prompt:\t{prompt}")

    tokenizer, model = load_model(model_name, use_fast=use_fast)

    run_inference(prompt, model, tokenizer, n_tokens)


if __name__ == "__main__":
    fire.Fire(main)
```

Wowoweewa!! It can create some file cleaning utilities.

</details>

---