sharpenb committed on
Commit
1a6903c
1 Parent(s): e113507

717b1b6eb6051b2ca72722be48ad37f0562895f3dbe0efa4216291a285059774

README.md ADDED
@@ -0,0 +1,85 @@
1
+ ---
2
+ thumbnail: "https://assets-global.website-files.com/646b351987a8d8ce158d1940/64ec9e96b4334c0e1ac41504_Logo%20with%20white%20text.svg"
3
+ base_model: nomic-ai/gpt4all-j
4
+ metrics:
5
+ - memory_disk
6
+ - memory_inference
7
+ - inference_latency
8
+ - inference_throughput
9
+ - inference_CO2_emissions
10
+ - inference_energy_consumption
11
+ tags:
12
+ - pruna-ai
13
+ ---
14
+ <!-- header start -->
15
+ <!-- 200823 -->
16
+ <div style="width: auto; margin-left: auto; margin-right: auto">
17
+ <a href="https://www.pruna.ai/" target="_blank" rel="noopener noreferrer">
18
+ <img src="https://i.imgur.com/eDAlcgk.png" alt="PrunaAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
19
+ </a>
20
+ </div>
21
+ <!-- header end -->
22
+
23
+ [![Twitter](https://img.shields.io/twitter/follow/PrunaAI?style=social)](https://twitter.com/PrunaAI)
24
+ [![GitHub](https://img.shields.io/github/followers/PrunaAI?label=Follow%20%40PrunaAI&style=social)](https://github.com/PrunaAI)
25
+ [![LinkedIn](https://img.shields.io/badge/LinkedIn-Connect-blue)](https://www.linkedin.com/company/93832878/admin/feed/posts/?feedType=following)
26
+ [![Discord](https://img.shields.io/badge/Discord-Join%20Us-blue?style=social&logo=discord)](https://discord.gg/CP4VSgck)
27
+
28
+ # Simply make AI models cheaper, smaller, faster, and greener!
29
+
30
+ - Give a thumbs up if you like this model!
31
+ - Contact us and tell us which model to compress next [here](https://www.pruna.ai/contact).
32
+ - Request access to easily compress your *own* AI models [here](https://z0halsaff74.typeform.com/pruna-access?typeform-source=www.pruna.ai).
33
+ - Read the documentation to learn more [here](https://pruna-ai-pruna.readthedocs-hosted.com/en/latest/).
34
+ - Join the Pruna AI community on Discord [here](https://discord.gg/CP4VSgck) to share feedback/suggestions or get help.
35
+
36
+ ## Results
37
+
38
+ ![image info](./plots.png)
39
+
40
+ **Frequently Asked Questions**
41
+ - ***How does the compression work?*** The model is compressed to 4-bit weights with AWQ (activation-aware weight quantization).
42
+ - ***How does the model quality change?*** The quality of the model output might vary compared to the base model.
43
+ - ***How is the model efficiency evaluated?*** These results were obtained on HARDWARE_NAME with the configuration described in `model/smash_config.json`, after a hardware warmup. The smashed model is compared directly to the original base model. Efficiency results may vary in other settings (e.g. other hardware, image size, batch size, ...). We recommend running the evaluation directly under your use-case conditions to see whether the smashed model can benefit you.
44
+ - ***What is the model format?*** We use safetensors.
45
+ - ***What calibration data has been used?*** If needed by the compression method, we used WikiText as the calibration data.
46
+ - ***What is the naming convention for Pruna Huggingface models?*** We take the original model name and append "turbo", "tiny", or "green" if the smashed model's measured inference latency, inference memory, or inference energy consumption, respectively, is less than 90% of the original base model's.
47
+ - ***How to compress my own models?*** You can request premium access to more compression methods and tech support for your specific use-cases [here](https://z0halsaff74.typeform.com/pruna-access?typeform-source=www.pruna.ai).
48
+ - ***What are "first" metrics?*** Results mentioning "first" are obtained after the first run of the model. The first run might take more memory or be slower than subsequent runs due to CUDA overheads.
49
+ - ***What are "Sync" and "Async" metrics?*** "Sync" metrics are obtained by syncing all GPU processes and stopping the measurement once all of them have finished. "Async" metrics are obtained without syncing all GPU processes; the measurement stops as soon as the model output can be used by the CPU. We provide both metrics since either could be relevant depending on the use case. We recommend testing the efficiency gains directly in your use cases.
50
+
51
+ ## Setup
52
+
53
+ You can run the smashed model with these steps:
54
+
55
+ 0. Check that the requirements from the original repo nomic-ai/gpt4all-j are installed. In particular, check the Python, CUDA, and transformers versions.
56
+ 1. Make sure that you have installed the quantization-related packages.
57
+ ```bash
58
+ pip install autoawq
59
+ ```
60
+ 2. Load & run the model.
61
+ ```python
62
+ from transformers import AutoTokenizer
63
+ from awq import AutoAWQForCausalLM
64
+
65
+ model = AutoAWQForCausalLM.from_quantized("PrunaAI/nomic-ai-gpt4all-j-AWQ-4bit-smashed", trust_remote_code=True, device_map='auto')
66
+ tokenizer = AutoTokenizer.from_pretrained("nomic-ai/gpt4all-j")
67
+
68
+ input_ids = tokenizer("What is the color of prunes?", return_tensors='pt').to(model.device)["input_ids"]
69
+
70
+ outputs = model.generate(input_ids, max_new_tokens=216)
71
+ print(tokenizer.decode(outputs[0]))
72
+ ```
73
+
74
+ ## Configurations
75
+
76
+ The configuration details are in `smash_config.json`.
77
+
78
+ ## Credits & License
79
+
80
+ The license of the smashed model follows the license of the original model. Please check the license of the original model nomic-ai/gpt4all-j, which provided the base model, before using this model. The license of `pruna-engine` is available [here](https://pypi.org/project/pruna-engine/) on PyPI.
81
+
82
+ ## Want to compress other models?
83
+
84
+ - Contact us and tell us which model to compress next [here](https://www.pruna.ai/contact).
85
+ - Request access to easily compress your own AI models [here](https://z0halsaff74.typeform.com/pruna-access?typeform-source=www.pruna.ai).
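
The naming convention described in the README's FAQ can be illustrated with a small sketch. Only the 90% threshold and the three suffixes come from the README; the function name `smashed_suffixes`, the metric keys, the sample numbers, and the reading of "speed" as latency (lower is better) are assumptions made for illustration.

```python
# Hypothetical helper: not part of pruna-engine or of this repository.
THRESHOLD = 0.90  # "less than 90% of the original base model" (from the README)

# Assumed mapping from suffix to the metric it reflects (lower is better).
SUFFIX_METRICS = {
    "turbo": "inference_latency",
    "tiny": "memory_inference",
    "green": "inference_energy_consumption",
}


def smashed_suffixes(base, smashed):
    """Return the suffixes whose metric fell below THRESHOLD of the base value."""
    return [
        suffix
        for suffix, metric in SUFFIX_METRICS.items()
        if smashed[metric] < THRESHOLD * base[metric]
    ]


# Made-up measurements, for illustration only.
base = {
    "inference_latency": 100.0,
    "memory_inference": 12.0,
    "inference_energy_consumption": 5.0,
}
smashed = {
    "inference_latency": 95.0,
    "memory_inference": 4.0,
    "inference_energy_consumption": 4.0,
}
print(smashed_suffixes(base, smashed))  # ['tiny', 'green']
```

In this made-up case the latency only drops to 95% of the base value, so "turbo" would not be appended, while memory and energy both fall below the 90% threshold.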
added_tokens.json ADDED
@@ -0,0 +1,145 @@
1
+ {
2
+ "<|extratoken_100|>": 50356,
3
+ "<|extratoken_101|>": 50357,
4
+ "<|extratoken_102|>": 50358,
5
+ "<|extratoken_103|>": 50359,
6
+ "<|extratoken_104|>": 50360,
7
+ "<|extratoken_105|>": 50361,
8
+ "<|extratoken_106|>": 50362,
9
+ "<|extratoken_107|>": 50363,
10
+ "<|extratoken_108|>": 50364,
11
+ "<|extratoken_109|>": 50365,
12
+ "<|extratoken_10|>": 50266,
13
+ "<|extratoken_110|>": 50366,
14
+ "<|extratoken_111|>": 50367,
15
+ "<|extratoken_112|>": 50368,
16
+ "<|extratoken_113|>": 50369,
17
+ "<|extratoken_114|>": 50370,
18
+ "<|extratoken_115|>": 50371,
19
+ "<|extratoken_116|>": 50372,
20
+ "<|extratoken_117|>": 50373,
21
+ "<|extratoken_118|>": 50374,
22
+ "<|extratoken_119|>": 50375,
23
+ "<|extratoken_11|>": 50267,
24
+ "<|extratoken_120|>": 50376,
25
+ "<|extratoken_121|>": 50377,
26
+ "<|extratoken_122|>": 50378,
27
+ "<|extratoken_123|>": 50379,
28
+ "<|extratoken_124|>": 50380,
29
+ "<|extratoken_125|>": 50381,
30
+ "<|extratoken_126|>": 50382,
31
+ "<|extratoken_127|>": 50383,
32
+ "<|extratoken_128|>": 50384,
33
+ "<|extratoken_129|>": 50385,
34
+ "<|extratoken_12|>": 50268,
35
+ "<|extratoken_130|>": 50386,
36
+ "<|extratoken_131|>": 50387,
37
+ "<|extratoken_132|>": 50388,
38
+ "<|extratoken_133|>": 50389,
39
+ "<|extratoken_134|>": 50390,
40
+ "<|extratoken_135|>": 50391,
41
+ "<|extratoken_136|>": 50392,
42
+ "<|extratoken_137|>": 50393,
43
+ "<|extratoken_138|>": 50394,
44
+ "<|extratoken_139|>": 50395,
45
+ "<|extratoken_13|>": 50269,
46
+ "<|extratoken_140|>": 50396,
47
+ "<|extratoken_141|>": 50397,
48
+ "<|extratoken_142|>": 50398,
49
+ "<|extratoken_143|>": 50399,
50
+ "<|extratoken_14|>": 50270,
51
+ "<|extratoken_15|>": 50271,
52
+ "<|extratoken_16|>": 50272,
53
+ "<|extratoken_17|>": 50273,
54
+ "<|extratoken_18|>": 50274,
55
+ "<|extratoken_19|>": 50275,
56
+ "<|extratoken_1|>": 50257,
57
+ "<|extratoken_20|>": 50276,
58
+ "<|extratoken_21|>": 50277,
59
+ "<|extratoken_22|>": 50278,
60
+ "<|extratoken_23|>": 50279,
61
+ "<|extratoken_24|>": 50280,
62
+ "<|extratoken_25|>": 50281,
63
+ "<|extratoken_26|>": 50282,
64
+ "<|extratoken_27|>": 50283,
65
+ "<|extratoken_28|>": 50284,
66
+ "<|extratoken_29|>": 50285,
67
+ "<|extratoken_2|>": 50258,
68
+ "<|extratoken_30|>": 50286,
69
+ "<|extratoken_31|>": 50287,
70
+ "<|extratoken_32|>": 50288,
71
+ "<|extratoken_33|>": 50289,
72
+ "<|extratoken_34|>": 50290,
73
+ "<|extratoken_35|>": 50291,
74
+ "<|extratoken_36|>": 50292,
75
+ "<|extratoken_37|>": 50293,
76
+ "<|extratoken_38|>": 50294,
77
+ "<|extratoken_39|>": 50295,
78
+ "<|extratoken_3|>": 50259,
79
+ "<|extratoken_40|>": 50296,
80
+ "<|extratoken_41|>": 50297,
81
+ "<|extratoken_42|>": 50298,
82
+ "<|extratoken_43|>": 50299,
83
+ "<|extratoken_44|>": 50300,
84
+ "<|extratoken_45|>": 50301,
85
+ "<|extratoken_46|>": 50302,
86
+ "<|extratoken_47|>": 50303,
87
+ "<|extratoken_48|>": 50304,
88
+ "<|extratoken_49|>": 50305,
89
+ "<|extratoken_4|>": 50260,
90
+ "<|extratoken_50|>": 50306,
91
+ "<|extratoken_51|>": 50307,
92
+ "<|extratoken_52|>": 50308,
93
+ "<|extratoken_53|>": 50309,
94
+ "<|extratoken_54|>": 50310,
95
+ "<|extratoken_55|>": 50311,
96
+ "<|extratoken_56|>": 50312,
97
+ "<|extratoken_57|>": 50313,
98
+ "<|extratoken_58|>": 50314,
99
+ "<|extratoken_59|>": 50315,
100
+ "<|extratoken_5|>": 50261,
101
+ "<|extratoken_60|>": 50316,
102
+ "<|extratoken_61|>": 50317,
103
+ "<|extratoken_62|>": 50318,
104
+ "<|extratoken_63|>": 50319,
105
+ "<|extratoken_64|>": 50320,
106
+ "<|extratoken_65|>": 50321,
107
+ "<|extratoken_66|>": 50322,
108
+ "<|extratoken_67|>": 50323,
109
+ "<|extratoken_68|>": 50324,
110
+ "<|extratoken_69|>": 50325,
111
+ "<|extratoken_6|>": 50262,
112
+ "<|extratoken_70|>": 50326,
113
+ "<|extratoken_71|>": 50327,
114
+ "<|extratoken_72|>": 50328,
115
+ "<|extratoken_73|>": 50329,
116
+ "<|extratoken_74|>": 50330,
117
+ "<|extratoken_75|>": 50331,
118
+ "<|extratoken_76|>": 50332,
119
+ "<|extratoken_77|>": 50333,
120
+ "<|extratoken_78|>": 50334,
121
+ "<|extratoken_79|>": 50335,
122
+ "<|extratoken_7|>": 50263,
123
+ "<|extratoken_80|>": 50336,
124
+ "<|extratoken_81|>": 50337,
125
+ "<|extratoken_82|>": 50338,
126
+ "<|extratoken_83|>": 50339,
127
+ "<|extratoken_84|>": 50340,
128
+ "<|extratoken_85|>": 50341,
129
+ "<|extratoken_86|>": 50342,
130
+ "<|extratoken_87|>": 50343,
131
+ "<|extratoken_88|>": 50344,
132
+ "<|extratoken_89|>": 50345,
133
+ "<|extratoken_8|>": 50264,
134
+ "<|extratoken_90|>": 50346,
135
+ "<|extratoken_91|>": 50347,
136
+ "<|extratoken_92|>": 50348,
137
+ "<|extratoken_93|>": 50349,
138
+ "<|extratoken_94|>": 50350,
139
+ "<|extratoken_95|>": 50351,
140
+ "<|extratoken_96|>": 50352,
141
+ "<|extratoken_97|>": 50353,
142
+ "<|extratoken_98|>": 50354,
143
+ "<|extratoken_99|>": 50355,
144
+ "<|extratoken_9|>": 50265
145
+ }
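
The entries in `added_tokens.json` follow a regular pattern: `<|extratoken_N|>` maps to ID 50256 + N for N from 1 to 143. As a rough sanity check, the sketch below rebuilds the mapping under that reading and compares it with a few entries from the file and with the `vocab_size` of 50400 recorded in `config.json`. The helper name `build_added_tokens` is hypothetical.

```python
EOS_ID = 50256   # ID of <|endoftext|>, the last token of the base GPT-2 vocabulary
NUM_EXTRA = 143  # extratoken_1 .. extratoken_143


def build_added_tokens():
    """Rebuild the extra-token mapping, assuming ID = EOS_ID + N."""
    return {f"<|extratoken_{n}|>": EOS_ID + n for n in range(1, NUM_EXTRA + 1)}


added = build_added_tokens()

# Spot-check against entries listed in added_tokens.json.
assert added["<|extratoken_1|>"] == 50257
assert added["<|extratoken_100|>"] == 50356
assert added["<|extratoken_143|>"] == 50399

# 50257 base tokens (IDs 0..50256) plus the 143 extra tokens.
total_vocab = (EOS_ID + 1) + len(added)
print(total_vocab)  # 50400
```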
config.json ADDED
@@ -0,0 +1,50 @@
1
+ {
2
+ "_name_or_path": "/ceph/hdd/staff/charpent/.cache/modelsyyvar2mu4tpgy2hv",
3
+ "activation_function": "gelu_new",
4
+ "architectures": [
5
+ "GPTJForCausalLM"
6
+ ],
7
+ "attn_pdrop": 0.0,
8
+ "bos_token_id": 50256,
9
+ "embd_pdrop": 0.0,
10
+ "eos_token_id": 50256,
11
+ "gradient_checkpointing": false,
12
+ "initializer_range": 0.02,
13
+ "layer_norm_epsilon": 1e-05,
14
+ "model_type": "gptj",
15
+ "n_embd": 4096,
16
+ "n_head": 16,
17
+ "n_inner": null,
18
+ "n_layer": 28,
19
+ "n_positions": 2048,
20
+ "quantization_config": {
21
+ "bits": 4,
22
+ "group_size": 128,
23
+ "modules_to_not_convert": null,
24
+ "quant_method": "awq",
25
+ "version": "gemm",
26
+ "zero_point": true
27
+ },
28
+ "resid_pdrop": 0.0,
29
+ "rotary": true,
30
+ "rotary_dim": 64,
31
+ "scale_attn_weights": true,
32
+ "summary_activation": null,
33
+ "summary_first_dropout": 0.1,
34
+ "summary_proj_to_labels": true,
35
+ "summary_type": "cls_index",
36
+ "summary_use_proj": true,
37
+ "task_specific_params": {
38
+ "text-generation": {
39
+ "do_sample": true,
40
+ "max_length": 50,
41
+ "temperature": 1.0
42
+ }
43
+ },
44
+ "tie_word_embeddings": false,
45
+ "tokenizer_class": "GPT2Tokenizer",
46
+ "torch_dtype": "float16",
47
+ "transformers_version": "4.40.0",
48
+ "use_cache": false,
49
+ "vocab_size": 50400
50
+ }
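
To give a feel for what `bits: 4`, `group_size: 128`, and `zero_point: true` in `quantization_config` mean, here is a minimal sketch of group-wise round-to-nearest quantization with a zero point. It is not AWQ itself: AWQ additionally rescales weight channels using activation statistics before quantizing, and real kernels pack the 4-bit codes into integers. The function names and the sample weights are made up for illustration.

```python
import math


def quantize_groups(weights, bits=4, group_size=128):
    """Quantize a flat list of floats group by group (asymmetric, round-to-nearest)."""
    qmax = 2 ** bits - 1  # 15 for 4 bits
    groups = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        w_min, w_max = min(group), max(group)
        scale = (w_max - w_min) / qmax or 1.0  # fall back to 1.0 for constant groups
        # Left unclamped for simplicity; packed formats keep it in the code range.
        zero_point = round(-w_min / scale)
        codes = [min(qmax, max(0, round(w / scale) + zero_point)) for w in group]
        groups.append((codes, scale, zero_point))
    return groups


def dequantize_groups(groups):
    """Map the integer codes back to approximate float weights."""
    return [(c - zp) * scale for codes, scale, zp in groups for c in codes]


weights = [math.sin(i) for i in range(256)]  # made-up weights: two groups of 128
groups = quantize_groups(weights)
restored = dequantize_groups(groups)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(len(groups), max_error)
```

Each group stores its own scale and zero point, so a smaller `group_size` tracks the local weight range more closely at the cost of more metadata per weight.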
generation_config.json ADDED
@@ -0,0 +1,8 @@
1
+ {
2
+ "_from_model_config": true,
3
+ "bos_token_id": 50256,
4
+ "do_sample": true,
5
+ "eos_token_id": 50256,
6
+ "transformers_version": "4.40.0",
7
+ "use_cache": false
8
+ }
merges.txt ADDED
The diff for this file is too large to render. See raw diff
smash_config.json ADDED
@@ -0,0 +1,31 @@
1
+ {
2
+ "api_key": null,
3
+ "verify_url": "http://johnrachwan.pythonanywhere.com",
4
+ "smash_config": {
5
+ "pruners": "None",
6
+ "pruning_ratio": 0.0,
7
+ "factorizers": "None",
8
+ "quantizers": "['awq']",
9
+ "weight_quantization_bits": 4,
10
+ "output_deviation": 0.005,
11
+ "compilers": "None",
12
+ "static_batch": true,
13
+ "static_shape": true,
14
+ "controlnet": "None",
15
+ "unet_dim": 4,
16
+ "device": "cuda",
17
+ "cache_dir": "/ceph/hdd/staff/charpent/.cache/modelsyyvar2mu",
18
+ "batch_size": 1,
19
+ "model_name": "nomic-ai/gpt4all-j",
20
+ "task": "text_text_generation",
21
+ "max_batch_size": 1,
22
+ "qtype_weight": "torch.qint8",
23
+ "qtype_activation": "torch.quint8",
24
+ "qobserver": "<class 'torch.ao.quantization.observer.MinMaxObserver'>",
25
+ "qscheme": "torch.per_tensor_symmetric",
26
+ "qconfig": "x86",
27
+ "group_size": 128,
28
+ "damp_percent": 0.1,
29
+ "save_load_fn": "hf-awq"
30
+ }
31
+ }
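
Several values in `smash_config.json` are Python literals stored as strings (for example `"None"` and `"['awq']"`). A reader who wants real Python objects could decode them along these lines. This is a sketch on an excerpt of the file, and `decode_value` is a hypothetical helper, not something `pruna-engine` is known to provide.

```python
import ast
import json

# Excerpt of smash_config.json, copied from the file shown above.
SMASH_CONFIG_EXCERPT = """
{
  "smash_config": {
    "pruners": "None",
    "quantizers": "['awq']",
    "weight_quantization_bits": 4,
    "device": "cuda",
    "group_size": 128
  }
}
"""


def decode_value(value):
    """Turn string-encoded Python literals back into Python objects."""
    if isinstance(value, str):
        try:
            return ast.literal_eval(value)
        except (ValueError, SyntaxError):
            return value  # ordinary strings such as "cuda" stay as they are
    return value


raw = json.loads(SMASH_CONFIG_EXCERPT)["smash_config"]
decoded = {key: decode_value(value) for key, value in raw.items()}
print(decoded)
```

`ast.literal_eval` only accepts literals, so strings that are not valid Python literals are returned unchanged rather than being executed.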
special_tokens_map.json ADDED
@@ -0,0 +1,23 @@
1
+ {
2
+ "bos_token": {
3
+ "content": "<|endoftext|>",
4
+ "lstrip": false,
5
+ "normalized": true,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "eos_token": {
10
+ "content": "<|endoftext|>",
11
+ "lstrip": false,
12
+ "normalized": true,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "unk_token": {
17
+ "content": "<|endoftext|>",
18
+ "lstrip": false,
19
+ "normalized": true,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ }
23
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,1167 @@
1
+ {
2
+ "add_bos_token": false,
3
+ "add_prefix_space": false,
4
+ "added_tokens_decoder": {
5
+ "50256": {
6
+ "content": "<|endoftext|>",
7
+ "lstrip": false,
8
+ "normalized": true,
9
+ "rstrip": false,
10
+ "single_word": false,
11
+ "special": true
12
+ },
13
+ "50257": {
14
+ "content": "<|extratoken_1|>",
15
+ "lstrip": false,
16
+ "normalized": true,
17
+ "rstrip": false,
18
+ "single_word": false,
19
+ "special": false
20
+ },
21
+ "50258": {
22
+ "content": "<|extratoken_2|>",
23
+ "lstrip": false,
24
+ "normalized": true,
25
+ "rstrip": false,
26
+ "single_word": false,
27
+ "special": false
28
+ },
29
+ "50259": {
30
+ "content": "<|extratoken_3|>",
31
+ "lstrip": false,
32
+ "normalized": true,
33
+ "rstrip": false,
34
+ "single_word": false,
35
+ "special": false
36
+ },
37
+ "50260": {
38
+ "content": "<|extratoken_4|>",
39
+ "lstrip": false,
40
+ "normalized": true,
41
+ "rstrip": false,
42
+ "single_word": false,
43
+ "special": false
44
+ },
45
+ "50261": {
46
+ "content": "<|extratoken_5|>",
47
+ "lstrip": false,
48
+ "normalized": true,
49
+ "rstrip": false,
50
+ "single_word": false,
51
+ "special": false
52
+ },
53
+ "50262": {
54
+ "content": "<|extratoken_6|>",
55
+ "lstrip": false,
56
+ "normalized": true,
57
+ "rstrip": false,
58
+ "single_word": false,
59
+ "special": false
60
+ },
61
+ "50263": {
62
+ "content": "<|extratoken_7|>",
63
+ "lstrip": false,
64
+ "normalized": true,
65
+ "rstrip": false,
66
+ "single_word": false,
67
+ "special": false
68
+ },
69
+ "50264": {
70
+ "content": "<|extratoken_8|>",
71
+ "lstrip": false,
72
+ "normalized": true,
73
+ "rstrip": false,
74
+ "single_word": false,
75
+ "special": false
76
+ },
77
+ "50265": {
78
+ "content": "<|extratoken_9|>",
79
+ "lstrip": false,
80
+ "normalized": true,
81
+ "rstrip": false,
82
+ "single_word": false,
83
+ "special": false
84
+ },
85
+ "50266": {
86
+ "content": "<|extratoken_10|>",
87
+ "lstrip": false,
88
+ "normalized": true,
89
+ "rstrip": false,
90
+ "single_word": false,
91
+ "special": false
92
+ },
93
+ "50267": {
94
+ "content": "<|extratoken_11|>",
95
+ "lstrip": false,
96
+ "normalized": true,
97
+ "rstrip": false,
98
+ "single_word": false,
99
+ "special": false
100
+ },
101
+ "50268": {
102
+ "content": "<|extratoken_12|>",
103
+ "lstrip": false,
104
+ "normalized": true,
105
+ "rstrip": false,
106
+ "single_word": false,
107
+ "special": false
108
+ },
109
+ "50269": {
110
+ "content": "<|extratoken_13|>",
111
+ "lstrip": false,
112
+ "normalized": true,
113
+ "rstrip": false,
114
+ "single_word": false,
115
+ "special": false
116
+ },
117
+ "50270": {
118
+ "content": "<|extratoken_14|>",
119
+ "lstrip": false,
120
+ "normalized": true,
121
+ "rstrip": false,
122
+ "single_word": false,
123
+ "special": false
124
+ },
125
+ "50271": {
126
+ "content": "<|extratoken_15|>",
127
+ "lstrip": false,
128
+ "normalized": true,
129
+ "rstrip": false,
130
+ "single_word": false,
131
+ "special": false
132
+ },
133
+ "50272": {
134
+ "content": "<|extratoken_16|>",
135
+ "lstrip": false,
136
+ "normalized": true,
137
+ "rstrip": false,
138
+ "single_word": false,
139
+ "special": false
140
+ },
141
+ "50273": {
142
+ "content": "<|extratoken_17|>",
143
+ "lstrip": false,
144
+ "normalized": true,
145
+ "rstrip": false,
146
+ "single_word": false,
147
+ "special": false
148
+ },
149
+ "50274": {
150
+ "content": "<|extratoken_18|>",
151
+ "lstrip": false,
152
+ "normalized": true,
153
+ "rstrip": false,
154
+ "single_word": false,
155
+ "special": false
156
+ },
157
+ "50275": {
158
+ "content": "<|extratoken_19|>",
159
+ "lstrip": false,
160
+ "normalized": true,
161
+ "rstrip": false,
162
+ "single_word": false,
163
+ "special": false
164
+ },
165
+ "50276": {
166
+ "content": "<|extratoken_20|>",
167
+ "lstrip": false,
168
+ "normalized": true,
169
+ "rstrip": false,
170
+ "single_word": false,
171
+ "special": false
172
+ },
173
+ "50277": {
174
+ "content": "<|extratoken_21|>",
175
+ "lstrip": false,
176
+ "normalized": true,
177
+ "rstrip": false,
178
+ "single_word": false,
179
+ "special": false
180
+ },
181
+ "50278": {
182
+ "content": "<|extratoken_22|>",
183
+ "lstrip": false,
184
+ "normalized": true,
185
+ "rstrip": false,
186
+ "single_word": false,
187
+ "special": false
188
+ },
189
+ "50279": {
190
+ "content": "<|extratoken_23|>",
191
+ "lstrip": false,
192
+ "normalized": true,
193
+ "rstrip": false,
194
+ "single_word": false,
195
+ "special": false
196
+ },
197
+ "50280": {
198
+ "content": "<|extratoken_24|>",
199
+ "lstrip": false,
200
+ "normalized": true,
201
+ "rstrip": false,
202
+ "single_word": false,
203
+ "special": false
204
+ },
205
+ "50281": {
206
+ "content": "<|extratoken_25|>",
207
+ "lstrip": false,
208
+ "normalized": true,
209
+ "rstrip": false,
210
+ "single_word": false,
211
+ "special": false
212
+ },
213
+ "50282": {
214
+ "content": "<|extratoken_26|>",
215
+ "lstrip": false,
216
+ "normalized": true,
217
+ "rstrip": false,
218
+ "single_word": false,
219
+ "special": false
220
+ },
221
+ "50283": {
222
+ "content": "<|extratoken_27|>",
223
+ "lstrip": false,
224
+ "normalized": true,
225
+ "rstrip": false,
226
+ "single_word": false,
227
+ "special": false
228
+ },
229
+ "50284": {
230
+ "content": "<|extratoken_28|>",
231
+ "lstrip": false,
232
+ "normalized": true,
233
+ "rstrip": false,
234
+ "single_word": false,
235
+ "special": false
236
+ },
237
+ "50285": {
238
+ "content": "<|extratoken_29|>",
239
+ "lstrip": false,
240
+ "normalized": true,
241
+ "rstrip": false,
242
+ "single_word": false,
243
+ "special": false
244
+ },
245
+ "50286": {
246
+ "content": "<|extratoken_30|>",
247
+ "lstrip": false,
248
+ "normalized": true,
249
+ "rstrip": false,
250
+ "single_word": false,
251
+ "special": false
252
+ },
253
+ "50287": {
254
+ "content": "<|extratoken_31|>",
255
+ "lstrip": false,
256
+ "normalized": true,
257
+ "rstrip": false,
258
+ "single_word": false,
259
+ "special": false
260
+ },
261
+ "50288": {
262
+ "content": "<|extratoken_32|>",
263
+ "lstrip": false,
264
+ "normalized": true,
265
+ "rstrip": false,
266
+ "single_word": false,
267
+ "special": false
268
+ },
269
+ "50289": {
270
+ "content": "<|extratoken_33|>",
271
+ "lstrip": false,
272
+ "normalized": true,
273
+ "rstrip": false,
274
+ "single_word": false,
275
+ "special": false
276
+ },
277
+ "50290": {
278
+ "content": "<|extratoken_34|>",
279
+ "lstrip": false,
280
+ "normalized": true,
281
+ "rstrip": false,
282
+ "single_word": false,
283
+ "special": false
284
+ },
285
+ "50291": {
286
+ "content": "<|extratoken_35|>",
287
+ "lstrip": false,
288
+ "normalized": true,
289
+ "rstrip": false,
290
+ "single_word": false,
291
+ "special": false
292
+ },
293
+ "50292": {
294
+ "content": "<|extratoken_36|>",
295
+ "lstrip": false,
296
+ "normalized": true,
297
+ "rstrip": false,
298
+ "single_word": false,
299
+ "special": false
300
+ },
301
+ "50293": {
302
+ "content": "<|extratoken_37|>",
303
+ "lstrip": false,
304
+ "normalized": true,
305
+ "rstrip": false,
306
+ "single_word": false,
307
+ "special": false
308
+ },
309
+ "50294": {
310
+ "content": "<|extratoken_38|>",
311
+ "lstrip": false,
312
+ "normalized": true,
313
+ "rstrip": false,
314
+ "single_word": false,
315
+ "special": false
316
+ },
317
+ "50295": {
318
+ "content": "<|extratoken_39|>",
319
+ "lstrip": false,
320
+ "normalized": true,
321
+ "rstrip": false,
322
+ "single_word": false,
323
+ "special": false
324
+ },
325
+ "50296": {
326
+ "content": "<|extratoken_40|>",
327
+ "lstrip": false,
328
+ "normalized": true,
329
+ "rstrip": false,
330
+ "single_word": false,
331
+ "special": false
332
+ },
333
+ "50297": {
334
+ "content": "<|extratoken_41|>",
335
+ "lstrip": false,
336
+ "normalized": true,
337
+ "rstrip": false,
338
+ "single_word": false,
339
+ "special": false
340
+ },
341
+ "50298": {
342
+ "content": "<|extratoken_42|>",
343
+ "lstrip": false,
344
+ "normalized": true,
345
+ "rstrip": false,
346
+ "single_word": false,
347
+ "special": false
348
+ },
349
+ "50299": {
350
+ "content": "<|extratoken_43|>",
351
+ "lstrip": false,
352
+ "normalized": true,
353
+ "rstrip": false,
354
+ "single_word": false,
355
+ "special": false
356
+ },
357
+ "50300": {
358
+ "content": "<|extratoken_44|>",
359
+ "lstrip": false,
360
+ "normalized": true,
361
+ "rstrip": false,
362
+ "single_word": false,
363
+ "special": false
364
+ },
365
+ "50301": {
366
+ "content": "<|extratoken_45|>",
367
+ "lstrip": false,
368
+ "normalized": true,
369
+ "rstrip": false,
370
+ "single_word": false,
371
+ "special": false
372
+ },
373
+ "50302": {
374
+ "content": "<|extratoken_46|>",
375
+ "lstrip": false,
376
+ "normalized": true,
377
+ "rstrip": false,
378
+ "single_word": false,
379
+ "special": false
380
+ },
381
+ "50303": {
382
+ "content": "<|extratoken_47|>",
383
+ "lstrip": false,
384
+ "normalized": true,
385
+ "rstrip": false,
386
+ "single_word": false,
387
+ "special": false
388
+ },
389
+ "50304": {
390
+ "content": "<|extratoken_48|>",
391
+ "lstrip": false,
392
+ "normalized": true,
393
+ "rstrip": false,
394
+ "single_word": false,
395
+ "special": false
396
+ },
397
+ "50305": {
398
+ "content": "<|extratoken_49|>",
399
+ "lstrip": false,
400
+ "normalized": true,
401
+ "rstrip": false,
402
+ "single_word": false,
403
+ "special": false
404
+ },
405
+ "50306": {
406
+ "content": "<|extratoken_50|>",
407
+ "lstrip": false,
408
+ "normalized": true,
409
+ "rstrip": false,
410
+ "single_word": false,
411
+ "special": false
412
+ },
413
+ "50307": {
414
+ "content": "<|extratoken_51|>",
415
+ "lstrip": false,
416
+ "normalized": true,
417
+ "rstrip": false,
418
+ "single_word": false,
419
+ "special": false
420
+ },
421
+ "50308": {
422
+ "content": "<|extratoken_52|>",
423
+ "lstrip": false,
424
+ "normalized": true,
425
+ "rstrip": false,
426
+ "single_word": false,
427
+ "special": false
428
+ },
429
+ "50309": {
430
+ "content": "<|extratoken_53|>",
431
+ "lstrip": false,
432
+ "normalized": true,
433
+ "rstrip": false,
434
+ "single_word": false,
435
+ "special": false
436
+ },
437
+ "50310": {
438
+ "content": "<|extratoken_54|>",
439
+ "lstrip": false,
440
+ "normalized": true,
441
+ "rstrip": false,
442
+ "single_word": false,
443
+ "special": false
444
+ },
445
+ "50311": {
446
+ "content": "<|extratoken_55|>",
447
+ "lstrip": false,
448
+ "normalized": true,
449
+ "rstrip": false,
450
+ "single_word": false,
451
+ "special": false
452
+ },
453
+ "50312": {
454
+ "content": "<|extratoken_56|>",
455
+ "lstrip": false,
456
+ "normalized": true,
457
+ "rstrip": false,
458
+ "single_word": false,
459
+ "special": false
460
+ },
461
+ "50313": {
462
+ "content": "<|extratoken_57|>",
463
+ "lstrip": false,
464
+ "normalized": true,
465
+ "rstrip": false,
466
+ "single_word": false,
467
+ "special": false
468
+ },
469
+ "50314": {
470
+ "content": "<|extratoken_58|>",
471
+ "lstrip": false,
472
+ "normalized": true,
473
+ "rstrip": false,
474
+ "single_word": false,
475
+ "special": false
476
+ },
477
+ "50315": {
478
+ "content": "<|extratoken_59|>",
479
+ "lstrip": false,
480
+ "normalized": true,
481
+ "rstrip": false,
482
+ "single_word": false,
483
+ "special": false
484
+ },
485
+ "50316": {
486
+ "content": "<|extratoken_60|>",
487
+ "lstrip": false,
488
+ "normalized": true,
489
+ "rstrip": false,
490
+ "single_word": false,
491
+ "special": false
492
+ },
493
+ "50317": {
494
+ "content": "<|extratoken_61|>",
495
+ "lstrip": false,
496
+ "normalized": true,
497
+ "rstrip": false,
498
+ "single_word": false,
499
+ "special": false
500
+ },
501
+ "50318": {
502
+ "content": "<|extratoken_62|>",
503
+ "lstrip": false,
504
+ "normalized": true,
505
+ "rstrip": false,
506
+ "single_word": false,
507
+ "special": false
508
+ },
509
+ "50319": {
510
+ "content": "<|extratoken_63|>",
511
+ "lstrip": false,
512
+ "normalized": true,
513
+ "rstrip": false,
514
+ "single_word": false,
515
+ "special": false
516
+ },
517
+ "50320": {
518
+ "content": "<|extratoken_64|>",
519
+ "lstrip": false,
520
+ "normalized": true,
521
+ "rstrip": false,
522
+ "single_word": false,
523
+ "special": false
524
+ },
525
+ "50321": {
526
+ "content": "<|extratoken_65|>",
527
+ "lstrip": false,
528
+ "normalized": true,
529
+ "rstrip": false,
530
+ "single_word": false,
531
+ "special": false
532
+ },
533
+ "50322": {
534
+ "content": "<|extratoken_66|>",
535
+ "lstrip": false,
536
+ "normalized": true,
537
+ "rstrip": false,
538
+ "single_word": false,
539
+ "special": false
540
+ },
541
+ "50323": {
542
+ "content": "<|extratoken_67|>",
543
+ "lstrip": false,
544
+ "normalized": true,
545
+ "rstrip": false,
546
+ "single_word": false,
547
+ "special": false
548
+ },
549
+ "50324": {
550
+ "content": "<|extratoken_68|>",
551
+ "lstrip": false,
552
+ "normalized": true,
553
+ "rstrip": false,
554
+ "single_word": false,
555
+ "special": false
556
+ },
557
+ "50325": {
558
+ "content": "<|extratoken_69|>",
559
+ "lstrip": false,
560
+ "normalized": true,
561
+ "rstrip": false,
562
+ "single_word": false,
+ "special": false
+ },
+ "50326": {
+ "content": "<|extratoken_70|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50327": {
+ "content": "<|extratoken_71|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50328": {
+ "content": "<|extratoken_72|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50329": {
+ "content": "<|extratoken_73|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50330": {
+ "content": "<|extratoken_74|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50331": {
+ "content": "<|extratoken_75|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50332": {
+ "content": "<|extratoken_76|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50333": {
+ "content": "<|extratoken_77|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50334": {
+ "content": "<|extratoken_78|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50335": {
+ "content": "<|extratoken_79|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50336": {
+ "content": "<|extratoken_80|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50337": {
+ "content": "<|extratoken_81|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50338": {
+ "content": "<|extratoken_82|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50339": {
+ "content": "<|extratoken_83|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50340": {
+ "content": "<|extratoken_84|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50341": {
+ "content": "<|extratoken_85|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50342": {
+ "content": "<|extratoken_86|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50343": {
+ "content": "<|extratoken_87|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50344": {
+ "content": "<|extratoken_88|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50345": {
+ "content": "<|extratoken_89|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50346": {
+ "content": "<|extratoken_90|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50347": {
+ "content": "<|extratoken_91|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50348": {
+ "content": "<|extratoken_92|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50349": {
+ "content": "<|extratoken_93|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50350": {
+ "content": "<|extratoken_94|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50351": {
+ "content": "<|extratoken_95|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50352": {
+ "content": "<|extratoken_96|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50353": {
+ "content": "<|extratoken_97|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50354": {
+ "content": "<|extratoken_98|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50355": {
+ "content": "<|extratoken_99|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50356": {
+ "content": "<|extratoken_100|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50357": {
+ "content": "<|extratoken_101|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50358": {
+ "content": "<|extratoken_102|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50359": {
+ "content": "<|extratoken_103|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50360": {
+ "content": "<|extratoken_104|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50361": {
+ "content": "<|extratoken_105|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50362": {
+ "content": "<|extratoken_106|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50363": {
+ "content": "<|extratoken_107|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50364": {
+ "content": "<|extratoken_108|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50365": {
+ "content": "<|extratoken_109|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50366": {
+ "content": "<|extratoken_110|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50367": {
+ "content": "<|extratoken_111|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50368": {
+ "content": "<|extratoken_112|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50369": {
+ "content": "<|extratoken_113|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50370": {
+ "content": "<|extratoken_114|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50371": {
+ "content": "<|extratoken_115|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50372": {
+ "content": "<|extratoken_116|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50373": {
+ "content": "<|extratoken_117|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50374": {
+ "content": "<|extratoken_118|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50375": {
+ "content": "<|extratoken_119|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50376": {
+ "content": "<|extratoken_120|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50377": {
+ "content": "<|extratoken_121|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50378": {
+ "content": "<|extratoken_122|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50379": {
+ "content": "<|extratoken_123|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50380": {
+ "content": "<|extratoken_124|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50381": {
+ "content": "<|extratoken_125|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50382": {
+ "content": "<|extratoken_126|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50383": {
+ "content": "<|extratoken_127|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50384": {
+ "content": "<|extratoken_128|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50385": {
+ "content": "<|extratoken_129|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50386": {
+ "content": "<|extratoken_130|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50387": {
+ "content": "<|extratoken_131|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50388": {
+ "content": "<|extratoken_132|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50389": {
+ "content": "<|extratoken_133|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50390": {
+ "content": "<|extratoken_134|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50391": {
+ "content": "<|extratoken_135|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50392": {
+ "content": "<|extratoken_136|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50393": {
+ "content": "<|extratoken_137|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50394": {
+ "content": "<|extratoken_138|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50395": {
+ "content": "<|extratoken_139|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50396": {
+ "content": "<|extratoken_140|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50397": {
+ "content": "<|extratoken_141|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50398": {
+ "content": "<|extratoken_142|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50399": {
+ "content": "<|extratoken_143|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ }
+ },
+ "bos_token": "<|endoftext|>",
+ "clean_up_tokenization_spaces": true,
+ "eos_token": "<|endoftext|>",
+ "errors": "replace",
+ "legacy": false,
+ "model_max_length": 2048,
+ "pad_token": null,
+ "tokenizer_class": "GPT2Tokenizer",
+ "unk_token": "<|endoftext|>"
+ }
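
The tokenizer configuration added above can be inspected with nothing more than the standard library. The following is a minimal sketch, not part of the commit: the `CONFIG_EXCERPT` string is a hand-shortened, hypothetical stand-in for the real file (only two of the `added_tokens_decoder` entries are reproduced), and the helper name `summarize_tokenizer_config` is an assumption made for illustration.

```python
import json

# Hypothetical, shortened excerpt with the same shape as the config in the
# diff above. In practice the text would be read from tokenizer_config.json.
CONFIG_EXCERPT = """
{
  "added_tokens_decoder": {
    "50398": {"content": "<|extratoken_142|>", "lstrip": false,
              "normalized": true, "rstrip": false,
              "single_word": false, "special": false},
    "50399": {"content": "<|extratoken_143|>", "lstrip": false,
              "normalized": true, "rstrip": false,
              "single_word": false, "special": false}
  },
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "model_max_length": 2048,
  "pad_token": null,
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|endoftext|>"
}
"""


def summarize_tokenizer_config(raw: str) -> dict:
    """Parse the JSON text and collect a few fields worth checking."""
    config = json.loads(raw)
    decoder = config.get("added_tokens_decoder", {})
    return {
        # Map integer token id -> token string.
        "added_tokens": {int(k): v["content"] for k, v in decoder.items()},
        # Ids whose entries are not flagged as special tokens.
        "non_special_ids": sorted(
            int(k) for k, v in decoder.items() if not v.get("special", False)
        ),
        # JSON null becomes Python None, so this is False for the excerpt.
        "has_pad_token": config.get("pad_token") is not None,
        "max_length": config.get("model_max_length"),
    }


summary = summarize_tokenizer_config(CONFIG_EXCERPT)
print(summary)
```

Because `pad_token` is `null` in this configuration, code that pads batches would presumably need to assign a pad token first; reusing the end-of-text token is a common choice for GPT-2 style tokenizers, though whether that suits this model is not stated in the commit.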
vocab.json ADDED
The diff for this file is too large to render. See raw diff