Update README.md
README.md
CHANGED
@@ -1,3 +1,72 @@
---
license: mit
datasets:
- AdapterOcean/python-code-instructions-18k-alpaca-standardized_cluster_1_alpaca
- AdapterOcean/python-code-instructions-18k-alpaca-standardized_cluster_2_alpaca
- AdapterOcean/python-code-instructions-18k-alpaca-standardized_cluster_3_alpaca
- AdapterOcean/python-code-instructions-18k-alpaca-standardized_cluster_4_alpaca
- kejian/codesearchnet-python-raw
pipeline_tag: text-generation
---

# Model Card for MicroBOB-python

MicroBOB-python is a new micro model, trained from scratch and based on RWKV x051a, which doesn't require a special kernel for training or inference. It was developed and trained using a modified version of nanoRWKV.
## Model Details

### Model Description

The base model was trained on tens of thousands of lines of open-source and internal Python code, then fine-tuned in 5 rounds using kejian/codesearchnet-python-raw, AdapterOcean/python-code-instructions-18k-alpaca-standardized_cluster_1_alpaca, and the 3 other datasets in the same series.

Developed as a simple autocomplete for an in-house Python code editor, it has gotten smart enough for its extremely small size (30 million parameters) that I thought I should share it. Only the model weights are licensed under MIT.

- **Developed by:** BalrogBob
- **Model type:** Custom implementation of RWKV x051a
- **License:** MIT (Model Weights)
- **Finetuned from model:** MicroBOB

## Uses

Simple autocompletion of Python or Python-like code.
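
As an illustration of how an editor might wrap the model for single-line autocomplete, here is a minimal sketch. The `generate` callable is a hypothetical stand-in for whatever function runs the model (it is not part of nanoRWKV), and the helper name `complete_line` is also made up for this example. The sketch only shows the post-processing step: dropping an echoed prompt and keeping the rest of the current line.

```python
from typing import Callable


def complete_line(
    prefix: str,
    generate: Callable[[str, int], str],
    max_new_tokens: int = 32,
) -> str:
    """Return a suggestion for the remainder of the current line.

    `generate(prompt, max_new_tokens)` is a hypothetical callable that
    returns generated text; plug in your own model wrapper here.
    """
    raw = generate(prefix, max_new_tokens)
    # Some generators echo the prompt back; strip it if present.
    if raw.startswith(prefix):
        raw = raw[len(prefix):]
    # Keep only the text up to the first newline.
    return raw.split("\n", 1)[0].rstrip()


def fake_generate(prompt: str, max_new_tokens: int) -> str:
    """Stub used for demonstration only; a real model call would go here."""
    return prompt + "ge(10):\n    print(i)\n"


suggestion = complete_line("for i in ran", fake_generate)
```

Cutting at the first newline keeps suggestions short, which tends to suit a very small model better than multi-line completions.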

### Direct Use

The `sample.py` script from https://github.com/BlinkDL/nanoRWKV is sufficient for inference. The training script included there also works with the model weights, at some cost in training speed and memory usage, but should produce functionally identical results.

### Downstream Use

Code replacement and reformatting: with a small amount of fine-tuning and some clever Python code, the model can be used to replace words, functions, and variables in Python code.
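
The non-model half of such a pipeline could look roughly like the sketch below. It is an assumption about how the "clever Python code" might be structured, not the author's implementation: the hypothetical helper `rename_identifier` uses the standard-library `tokenize` module to find identifier tokens, so strings and comments are left alone, and a fine-tuned model would be the piece that proposes the new name.

```python
import io
import tokenize


def rename_identifier(source: str, old: str, new: str) -> str:
    """Replace every identifier token equal to `old` with `new`.

    Hypothetical helper for illustration; it does not understand scopes,
    so it renames all matching names in the given source text.
    """
    lines = source.splitlines(keepends=True)
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == old:
            # Token rows are 1-indexed; columns are character offsets.
            hits.append((tok.start[0] - 1, tok.start[1], tok.end[1]))
    # Apply edits right to left so earlier offsets stay valid.
    for row, start, end in sorted(hits, reverse=True):
        line = lines[row]
        lines[row] = line[:start] + new + line[end:]
    return "".join(lines)


example = 'count = 1\nprint("count", count)  # count\n'
renamed = rename_identifier(example, "count", "total")
```

Working on tokens rather than plain text avoids the usual pitfall of search-and-replace, where a name inside a string literal or comment gets rewritten by accident.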

### Out-of-Scope Use

RLHF training can be used to instruction-tune the model, with limited success. It blunts the model's intelligence because so few parameters are available: the RLHF training overwrites information from the dataset in the model weights.

## Bias, Risks, and Limitations

The model has no bias training or safety guardrails. It was trained on code from open web sources and may incidentally produce malicious or insecure code. Use at your own risk! You are fully responsible for any generations produced using these model weights.

## How to Get Started with the Model

Clone https://github.com/BlinkDL/nanoRWKV. All the code contained therein is compatible with this model. Although the code that generated the model is optimized and customized, the base nanoRWKV package can fine-tune the MicroBOB-python weights without issue, at the cost of higher memory usage.

## Training Details

### Training Data

- https://huggingface.co/datasets/AdapterOcean/python-code-instructions-18k-alpaca-standardized_cluster_1_alpaca
- https://huggingface.co/datasets/AdapterOcean/python-code-instructions-18k-alpaca-standardized_cluster_2_alpaca
- https://huggingface.co/datasets/AdapterOcean/python-code-instructions-18k-alpaca-standardized_cluster_3_alpaca
- https://huggingface.co/datasets/AdapterOcean/python-code-instructions-18k-alpaca-standardized_cluster_4_alpaca
- https://huggingface.co/datasets/kejian/codesearchnet-python-raw
- My personal Python code folder with 40+ projects and 30k lines of code

### Training Procedure

Standard nanoRWKV data prep with a custom training loop.

#### Preprocessing

All datasets were tokenized with the GPT-2 encoding for simplicity. A version of MicroBOB with a custom BPE encoder is in development.
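
To make the tokenization step concrete, here is a minimal sketch of packing token ids into a flat binary file. It assumes the nanoGPT-style layout (a flat file of unsigned 16-bit ids with an end-of-text token after each document), which may differ from the actual data prep used here. The helper name `write_token_bin` and its `encode` parameter are hypothetical; the constant reflects the GPT-2 vocabulary, which has 50257 entries with `<|endoftext|>` at id 50256, so every id fits in 16 bits.

```python
from array import array
from typing import Callable, Iterable, List

GPT2_EOT_ID = 50256  # <|endoftext|> in the GPT-2 vocabulary


def write_token_bin(
    texts: Iterable[str],
    encode: Callable[[str], List[int]],
    path: str,
) -> int:
    """Encode each text, append an end-of-text id, and write uint16 ids.

    `encode` is any callable mapping a string to GPT-2 token ids
    (an assumption; supply your own tokenizer). Returns the token count.
    """
    ids = array("H")  # unsigned 16-bit integers, native byte order
    for text in texts:
        ids.extend(encode(text))  # raises OverflowError if an id > 65535
        ids.append(GPT2_EOT_ID)   # document separator
    with open(path, "wb") as f:
        ids.tofile(f)
    return len(ids)
```

If the `tiktoken` package is installed, `tiktoken.get_encoding("gpt2").encode_ordinary` is one callable that could be passed as `encode`.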