nz committed on
Commit
a599f27
1 Parent(s): e621107

Upload tokenizer

Files changed (4)
  1. README.md +199 -0
  2. special_tokens_map.json +5 -0
  3. tokenizer.json +1833 -0
  4. tokenizer_config.json +18 -0
README.md ADDED
@@ -0,0 +1,199 @@
+ ---
+ library_name: transformers
+ tags: []
+ ---
+
+ # Model Card for Model ID
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+ This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+
+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Dataset Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]
special_tokens_map.json ADDED
@@ -0,0 +1,5 @@
+ {
+ "bos_token": "<|endoftext|>",
+ "eos_token": "<|endoftext|>",
+ "unk_token": "<|endoftext|>"
+ }
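The special tokens map above assigns all three roles to the same string. As a minimal sketch (not part of the commit; the helper name `roles_by_token` is hypothetical and the JSON is copied inline so nothing is loaded from the Hub), the mapping can be inverted to make that explicit:

```python
import json

# Contents of special_tokens_map.json as added in this commit, inlined so the
# sketch runs without downloading anything.
SPECIAL_TOKENS_MAP = """
{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "unk_token": "<|endoftext|>"
}
"""


def roles_by_token(raw_json: str) -> dict[str, list[str]]:
    """Group special-token roles by the token string they map to.

    Hypothetical helper for illustration only; it is not a transformers API.
    """
    mapping = json.loads(raw_json)
    grouped: dict[str, list[str]] = {}
    for role, token in mapping.items():
        grouped.setdefault(token, []).append(role)
    # Sort roles so the result does not depend on key order in the file.
    return {token: sorted(roles) for token, roles in grouped.items()}


grouped = roles_by_token(SPECIAL_TOKENS_MAP)
for token, roles in grouped.items():
    print(f"{token!r} serves as: {', '.join(roles)}")
```

Because beginning-of-sequence, end-of-sequence and unknown all share `<|endoftext|>`, seeing that token in an encoded sequence does not by itself say which role it played.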
tokenizer.json ADDED
@@ -0,0 +1,1833 @@
+ {
+ "version": "1.0",
+ "truncation": null,
+ "padding": null,
+ "added_tokens": [
+ {
+ "id": 0,
+ "content": "<|endoftext|>",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": true
+ }
+ ],
+ "normalizer": null,
+ "pre_tokenizer": {
+ "type": "ByteLevel",
+ "add_prefix_space": false,
+ "trim_offsets": true,
+ "use_regex": true
+ },
+ "post_processor": {
+ "type": "ByteLevel",
+ "add_prefix_space": true,
+ "trim_offsets": true,
+ "use_regex": true
+ },
+ "decoder": null,
+ "model": {
+ "type": "BPE",
+ "dropout": null,
+ "unk_token": "<|endoftext|>",
+ "continuing_subword_prefix": null,
+ "end_of_word_suffix": null,
+ "fuse_unk": false,
+ "byte_fallback": false,
+ "ignore_merges": false,
+ "vocab": {
40
+ "<|endoftext|>": 0,
41
+ "\u0000": 1,
42
+ "\u0001": 2,
43
+ "\u0002": 3,
44
+ "\u0003": 4,
45
+ "\u0004": 5,
46
+ "\u0005": 6,
47
+ "\u0006": 7,
48
+ "\u0007": 8,
49
+ "\b": 9,
50
+ "\t": 10,
51
+ "\n": 11,
52
+ "\u000b": 12,
53
+ "\f": 13,
54
+ "\r": 14,
55
+ "\u000e": 15,
56
+ "\u000f": 16,
57
+ "\u0010": 17,
58
+ "\u0011": 18,
59
+ "\u0012": 19,
60
+ "\u0013": 20,
61
+ "\u0014": 21,
62
+ "\u0015": 22,
63
+ "\u0016": 23,
64
+ "\u0017": 24,
65
+ "\u0018": 25,
66
+ "\u0019": 26,
67
+ "\u001a": 27,
68
+ "\u001b": 28,
69
+ "\u001c": 29,
70
+ "\u001d": 30,
71
+ "\u001e": 31,
72
+ "\u001f": 32,
73
+ " ": 33,
74
+ "!": 34,
75
+ "\"": 35,
76
+ "#": 36,
77
+ "$": 37,
78
+ "%": 38,
79
+ "&": 39,
80
+ "'": 40,
81
+ "(": 41,
82
+ ")": 42,
83
+ "*": 43,
84
+ "+": 44,
85
+ ",": 45,
86
+ "-": 46,
87
+ ".": 47,
88
+ "/": 48,
89
+ "0": 49,
90
+ "1": 50,
91
+ "2": 51,
92
+ "3": 52,
93
+ "4": 53,
94
+ "5": 54,
95
+ "6": 55,
96
+ "7": 56,
97
+ "8": 57,
98
+ "9": 58,
99
+ ":": 59,
100
+ ";": 60,
101
+ "<": 61,
102
+ "=": 62,
103
+ ">": 63,
104
+ "?": 64,
105
+ "@": 65,
106
+ "A": 66,
107
+ "B": 67,
108
+ "C": 68,
109
+ "D": 69,
110
+ "E": 70,
111
+ "F": 71,
112
+ "G": 72,
113
+ "H": 73,
114
+ "I": 74,
115
+ "J": 75,
116
+ "K": 76,
117
+ "L": 77,
118
+ "M": 78,
119
+ "N": 79,
120
+ "O": 80,
121
+ "P": 81,
122
+ "Q": 82,
123
+ "R": 83,
124
+ "S": 84,
125
+ "T": 85,
126
+ "U": 86,
127
+ "V": 87,
128
+ "W": 88,
129
+ "X": 89,
130
+ "Y": 90,
131
+ "Z": 91,
132
+ "[": 92,
133
+ "\\": 93,
134
+ "]": 94,
135
+ "^": 95,
136
+ "_": 96,
137
+ "`": 97,
138
+ "a": 98,
139
+ "b": 99,
140
+ "c": 100,
141
+ "d": 101,
142
+ "e": 102,
143
+ "f": 103,
144
+ "g": 104,
145
+ "h": 105,
146
+ "i": 106,
147
+ "j": 107,
148
+ "k": 108,
149
+ "l": 109,
150
+ "m": 110,
151
+ "n": 111,
152
+ "o": 112,
153
+ "p": 113,
154
+ "q": 114,
155
+ "r": 115,
156
+ "s": 116,
157
+ "t": 117,
158
+ "u": 118,
159
+ "v": 119,
160
+ "w": 120,
161
+ "x": 121,
162
+ "y": 122,
163
+ "z": 123,
164
+ "{": 124,
165
+ "|": 125,
166
+ "}": 126,
167
+ "~": 127,
168
+ "": 128,
169
+ "€": 129,
170
+ "": 130,
171
+ "‚": 131,
172
+ "ƒ": 132,
173
+ "„": 133,
174
+ "…": 134,
175
+ "†": 135,
176
+ "‡": 136,
177
+ "ˆ": 137,
178
+ "‰": 138,
179
+ "Š": 139,
180
+ "‹": 140,
181
+ "Œ": 141,
182
+ "": 142,
183
+ "Ž": 143,
184
+ "": 144,
185
+ "": 145,
186
+ "‘": 146,
187
+ "’": 147,
188
+ "“": 148,
189
+ "”": 149,
190
+ "•": 150,
191
+ "–": 151,
192
+ "—": 152,
193
+ "˜": 153,
194
+ "™": 154,
195
+ "š": 155,
196
+ "›": 156,
197
+ "œ": 157,
198
+ "": 158,
199
+ "ž": 159,
200
+ "Ÿ": 160,
201
+ " ": 161,
202
+ "¡": 162,
203
+ "¢": 163,
204
+ "£": 164,
205
+ "¤": 165,
206
+ "¥": 166,
207
+ "¦": 167,
208
+ "§": 168,
209
+ "¨": 169,
210
+ "©": 170,
211
+ "ª": 171,
212
+ "«": 172,
213
+ "¬": 173,
214
+ "­": 174,
215
+ "®": 175,
216
+ "¯": 176,
217
+ "°": 177,
218
+ "±": 178,
219
+ "²": 179,
220
+ "³": 180,
221
+ "´": 181,
222
+ "µ": 182,
223
+ "¶": 183,
224
+ "·": 184,
225
+ "¸": 185,
226
+ "¹": 186,
227
+ "º": 187,
228
+ "»": 188,
229
+ "¼": 189,
230
+ "½": 190,
231
+ "¾": 191,
232
+ "¿": 192,
233
+ "À": 193,
234
+ "Á": 194,
235
+ "Â": 195,
236
+ "Ã": 196,
237
+ "Ä": 197,
238
+ "Å": 198,
239
+ "Æ": 199,
240
+ "Ç": 200,
241
+ "È": 201,
242
+ "É": 202,
243
+ "Ê": 203,
244
+ "Ë": 204,
245
+ "Ì": 205,
246
+ "Í": 206,
247
+ "Î": 207,
248
+ "Ï": 208,
249
+ "Ð": 209,
250
+ "Ñ": 210,
251
+ "Ò": 211,
252
+ "Ó": 212,
253
+ "Ô": 213,
254
+ "Õ": 214,
255
+ "Ö": 215,
256
+ "×": 216,
257
+ "Ø": 217,
258
+ "Ù": 218,
259
+ "Ú": 219,
260
+ "Û": 220,
261
+ "Ü": 221,
262
+ "Ý": 222,
263
+ "Þ": 223,
264
+ "ß": 224,
265
+ "à": 225,
266
+ "á": 226,
267
+ "â": 227,
268
+ "ã": 228,
269
+ "ä": 229,
270
+ "å": 230,
271
+ "æ": 231,
272
+ "ç": 232,
273
+ "è": 233,
274
+ "é": 234,
275
+ "ê": 235,
276
+ "ë": 236,
277
+ "ì": 237,
278
+ "í": 238,
279
+ "î": 239,
280
+ "ï": 240,
281
+ "ð": 241,
282
+ "ñ": 242,
283
+ "ò": 243,
284
+ "ó": 244,
285
+ "ô": 245,
286
+ "õ": 246,
287
+ "ö": 247,
288
+ "÷": 248,
289
+ "ø": 249,
290
+ "ù": 250,
291
+ "ú": 251,
292
+ "û": 252,
293
+ "ü": 253,
294
+ "ý": 254,
295
+ "þ": 255,
296
+ "ÿ": 256,
297
+ "Ċ": 257,
298
+ "Ġ": 258,
299
+ "Ġt": 259,
300
+ "he": 260,
301
+ "Ġa": 261,
302
+ "Ġs": 262,
303
+ "nd": 263,
304
+ "Ġw": 264,
305
+ "Ġthe": 265,
306
+ "ed": 266,
307
+ "Ġb": 267,
308
+ "Ġto": 268,
309
+ "Ġand": 269,
310
+ "Ġh": 270,
311
+ "Ġf": 271,
312
+ "ĠT": 272,
313
+ "in": 273,
314
+ "Ġwa": 274,
315
+ "re": 275,
316
+ "it": 276,
317
+ "ou": 277,
318
+ "Ġl": 278,
319
+ "Ġd": 279,
320
+ "Ġc": 280,
321
+ "Ġp": 281,
322
+ "ay": 282,
323
+ "Ġm": 283,
324
+ "er": 284,
325
+ "Ġwas": 285,
326
+ "ĠThe": 286,
327
+ "om": 287,
328
+ "Ġhe": 288,
329
+ "is": 289,
330
+ "Ġn": 290,
331
+ "ar": 291,
332
+ "im": 292,
333
+ "on": 293,
334
+ "Ġsa": 294,
335
+ "id": 295,
336
+ "ll": 296,
337
+ "Ġha": 297,
338
+ "Ġg": 298,
339
+ "at": 299,
340
+ "ĠS": 300,
341
+ "ing": 301,
342
+ "ot": 302,
343
+ "en": 303,
344
+ "an": 304,
345
+ "le": 305,
346
+ "or": 306,
347
+ "end": 307,
348
+ "ir": 308,
349
+ "of": 309,
350
+ "am": 310,
351
+ "et": 311,
352
+ "ĠH": 312,
353
+ "Ġit": 313,
354
+ "Ġth": 314,
355
+ "ig": 315,
356
+ "ĠThey": 316,
357
+ "Ġin": 317,
358
+ "il": 318,
359
+ "Ġpl": 319,
360
+ "Ġ\"": 320,
361
+ "ĠHe": 321,
362
+ "ow": 322,
363
+ "ri": 323,
364
+ "ver": 324,
365
+ "ut": 325,
366
+ "Ġu": 326,
367
+ "Ġbe": 327,
368
+ "Ġplay": 328,
369
+ "Ġsaid": 329,
370
+ "ith": 330,
371
+ "Ġday": 331,
372
+ "Ġwith": 332,
373
+ "pp": 333,
374
+ "On": 334,
375
+ "Ġy": 335,
376
+ "oo": 336,
377
+ "ked": 337,
378
+ "Ġr": 338,
379
+ "ex": 339,
380
+ "Ġher": 340,
381
+ "ce": 341,
382
+ "ĠI": 342,
383
+ "ĠTim": 343,
384
+ "ĠShe": 344,
385
+ "ld": 345,
386
+ "Ġhis": 346,
387
+ "Ġst": 347,
388
+ "ke": 348,
389
+ "Ġbig": 349,
390
+ "nt": 350,
391
+ "ck": 351,
392
+ "very": 352,
393
+ "Ġyou": 353,
394
+ "st": 354,
395
+ "ve": 355,
396
+ "Ġhapp": 356,
397
+ "un": 357,
398
+ "Ġon": 358,
399
+ "riend": 359,
400
+ "Ġfriend": 360,
401
+ "all": 361,
402
+ "ily": 362,
403
+ "ext": 363,
404
+ "ĠL": 364,
405
+ "Ġthey": 365,
406
+ "oft": 366,
407
+ "Ġwe": 367,
408
+ "Ġhad": 368,
409
+ "Ġnot": 369,
410
+ "Ġli": 370,
411
+ "Ġup": 371,
412
+ "her": 372,
413
+ "Ġwant": 373,
414
+ "Ġof": 374,
415
+ "itt": 375,
416
+ "<|": 376,
417
+ "|>": 377,
418
+ "endoft": 378,
419
+ "endoftext": 379,
420
+ "ad": 380,
421
+ "se": 381,
422
+ "ĠB": 382,
423
+ "Ġdo": 383,
424
+ "Ġe": 384,
425
+ "Ġhappy": 385,
426
+ "Ġvery": 386,
427
+ "ent": 387,
428
+ "ĠM": 388,
429
+ "'s": 389,
430
+ "es": 390,
431
+ "Ġsaw": 391,
432
+ "One": 392,
433
+ "Ġthat": 393,
434
+ "ould": 394,
435
+ "Ġmom": 395,
436
+ "Ġfor": 396,
437
+ "ittle": 397,
438
+ "Ġsh": 398,
439
+ "Ġlittle": 399,
440
+ "Ġshe": 400,
441
+ ".\"": 401,
442
+ "ime": 402,
443
+ "Ġnam": 403,
444
+ "ch": 404,
445
+ "Ġtime": 405,
446
+ "Ġk": 406,
447
+ "ound": 407,
448
+ "Ġso": 408,
449
+ "Ġthere": 409,
450
+ "Ġnamed": 410,
451
+ "Ġbo": 411,
452
+ "Ġsm": 412,
453
+ "ĠLily": 413,
454
+ "Ġwere": 414,
455
+ "Ġwanted": 415,
456
+ "Ġne": 416,
457
+ "!\"": 417,
458
+ "Ġbut": 418,
459
+ "Ġfriends": 419,
460
+ "out": 420,
461
+ "ved": 421,
462
+ "ĠTom": 422,
463
+ "The": 423,
464
+ "ird": 424,
465
+ "ht": 425,
466
+ "el": 426,
467
+ "Ġbird": 427,
468
+ "Ġan": 428,
469
+ "al": 429,
470
+ "ake": 430,
471
+ "Ġtoo": 431,
472
+ "ĠIt": 432,
473
+ "ug": 433,
474
+ "ome": 434,
475
+ "Ġwent": 435,
476
+ "ide": 436,
477
+ "Ġhel": 437,
478
+ "Once": 438,
479
+ "Ġwh": 439,
480
+ "Ġall": 440,
481
+ "Ġis": 441,
482
+ "Ġhelp": 442,
483
+ "ue": 443,
484
+ "Ġlo": 444,
485
+ "Ġloo": 445,
486
+ "ter": 446,
487
+ "Ġupon": 447,
488
+ "ĠA": 448,
489
+ "ry": 449,
490
+ "ore": 450,
491
+ "Ġfun": 451,
492
+ "ind": 452,
493
+ "Ġtoy": 453,
494
+ "get": 454,
495
+ "ame": 455,
496
+ "ill": 456,
497
+ "Ġas": 457,
498
+ "Ġat": 458,
499
+ "ra": 459,
500
+ "Ġdid": 460,
501
+ "Ġj": 461,
502
+ "gether": 462,
503
+ "Ġre": 463,
504
+ "ur": 464,
505
+ "Ġo": 465,
506
+ "Ġtogether": 466,
507
+ "Ġse": 467,
508
+ "Ġcat": 468,
509
+ "ack": 469,
510
+ "Ġtre": 470,
511
+ "ly": 471,
512
+ "ood": 472,
513
+ "ic": 473,
514
+ "Ġdog": 474,
515
+ "ted": 475,
516
+ "Ġcould": 476,
517
+ "Ġcan": 477,
518
+ "Ġtheir": 478,
519
+ "ard": 479,
520
+ "ark": 480,
521
+ "?\"": 481,
522
+ "ec": 482,
523
+ "Ġplayed": 483,
524
+ "Ġball": 484,
525
+ "Ġgir": 485,
526
+ "Ġro": 486,
527
+ "Ġhim": 487,
528
+ "Ġgirl": 488,
529
+ "way": 489,
530
+ "hed": 490,
531
+ "Ġgo": 491,
532
+ "my": 492,
533
+ "Ġle": 493,
534
+ "Ġare": 494,
535
+ "'t": 495,
536
+ "Ġout": 496,
537
+ "Ġfr": 497,
538
+ "ain": 498,
539
+ "Ġkn": 499,
540
+ "hen": 500,
541
+ "Ġthem": 501,
542
+ "um": 502,
543
+ "ax": 503,
544
+ "Ġsad": 504,
545
+ "Ġboy": 505,
546
+ "Ġtree": 506,
547
+ "ul": 507,
548
+ "other": 508,
549
+ "Ġman": 509,
550
+ "Ġhave": 510,
551
+ "Ġloved": 511,
552
+ "Ġcl": 512,
553
+ "Ġfound": 513,
554
+ "Ġlooked": 514,
555
+ "oug": 515,
556
+ "ĠSue": 516,
557
+ "Ġsp": 517,
558
+ "Ġstar": 518,
559
+ "one": 519,
560
+ "Ġsc": 520,
561
+ "hing": 521,
562
+ "Ġback": 522,
563
+ "ĠMax": 523,
564
+ "own": 524,
565
+ "are": 525,
566
+ "Ġlike": 526,
567
+ "Ġbec": 527,
568
+ "side": 528,
569
+ "ful": 529,
570
+ "Ġme": 530,
571
+ "Ġpark": 531,
572
+ "ong": 532,
573
+ "Ġcar": 533,
574
+ "ight": 534,
575
+ "op": 535,
576
+ "ĠOne": 536,
577
+ "elt": 537,
578
+ "Ġliked": 538,
579
+ "Ġwould": 539,
580
+ "Ġla": 540,
581
+ "Ġmake": 541,
582
+ "Ġfa": 542,
583
+ "Ġfelt": 543,
584
+ "round": 544,
585
+ "You": 545,
586
+ "ell": 546,
587
+ "ĠW": 547,
588
+ "Ġsee": 548,
589
+ "ĠBut": 549,
590
+ "omet": 550,
591
+ "Ġasked": 551,
592
+ "Ġnew": 552,
593
+ "ag": 553,
594
+ "ĠSam": 554,
595
+ "ouse": 555,
596
+ "Ġcame": 556,
597
+ "ared": 557,
598
+ "Ġstarted": 558,
599
+ "Ġno": 559,
600
+ "ice": 560,
601
+ "ĠBen": 561,
602
+ "ought": 562,
603
+ "Ġother": 563,
604
+ "Ġal": 564,
605
+ "iled": 565,
606
+ "Ġag": 566,
607
+ "Ġsmall": 567,
608
+ "Ġgood": 568,
609
+ "Ġsomet": 569,
610
+ "Ġbr": 570,
611
+ "ss": 571,
612
+ "ried": 572,
613
+ "ade": 573,
614
+ "Ġsmiled": 574,
615
+ "ings": 575,
616
+ "Ġsay": 576,
617
+ "ob": 577,
618
+ "pot": 578,
619
+ "Ġfind": 579,
620
+ "Ġwor": 580,
621
+ "ia": 581,
622
+ "ty": 582,
623
+ "Ġaway": 583,
624
+ "Ġput": 584,
625
+ "Ġmade": 585,
626
+ "Ġthought": 586,
627
+ "ened": 587,
628
+ "Ġfrom": 588,
629
+ "Ġwhat": 589,
630
+ "Ġhome": 590,
631
+ "Ġsomething": 591,
632
+ "Ġplaying": 592,
633
+ "Ġex": 593,
634
+ "Ġco": 594,
635
+ "Ġevery": 595,
636
+ "ook": 596,
637
+ "Ġwal": 597,
638
+ "uc": 598,
639
+ "Ġmu": 599,
640
+ "ach": 600,
641
+ "ĠSpot": 601,
642
+ "arn": 602,
643
+ "ĠF": 603,
644
+ "Ġran": 604,
645
+ "ile": 605,
646
+ "ie": 606,
647
+ "ave": 607,
648
+ "Ġagain": 608,
649
+ "Ġlaug": 609,
650
+ "Ġhouse": 610,
651
+ "Ġsome": 611,
652
+ "ĠJ": 612,
653
+ "Ġdown": 613,
654
+ "Ġfl": 614,
655
+ "dd": 615,
656
+ "Ġtook": 616,
657
+ "Ġscared": 617,
658
+ "Ġtoys": 618,
659
+ "king": 619,
660
+ "Ġlearn": 620,
661
+ "ny": 621,
662
+ "Ġpr": 622,
663
+ "Ġbox": 623,
664
+ "ure": 624,
665
+ "Ġwill": 625,
666
+ "if": 626,
667
+ "ret": 627,
668
+ "ĠYou": 628,
669
+ "ab": 629,
670
+ "ick": 630,
671
+ "ep": 631,
672
+ "Ġthings": 632,
673
+ "Ġmy": 633,
674
+ "Ġyour": 634,
675
+ "Ġaround": 635,
676
+ "Ġbl": 636,
677
+ "Ġlived": 637,
678
+ "oud": 638,
679
+ "ish": 639,
680
+ "uck": 640,
681
+ "Ġwhen": 641,
682
+ "Ġsun": 642,
683
+ ",\"": 643,
684
+ "Ġfe": 644,
685
+ "Ġthen": 645,
686
+ "as": 646,
687
+ "Ġsw": 647,
688
+ "Ġch": 648,
689
+ "us": 649,
690
+ "pped": 650,
691
+ "ĠMia": 651,
692
+ "Ġab": 652,
693
+ "ank": 653,
694
+ "Tim": 654,
695
+ "ucy": 655,
696
+ "ump": 656,
697
+ "Ġget": 657,
698
+ "Ġlot": 658,
699
+ "Th": 659,
700
+ "ist": 660,
701
+ "oth": 661,
702
+ "Ġtried": 662,
703
+ "ap": 663,
704
+ "Ġknow": 664,
705
+ "Ġgot": 665,
706
+ "Ġsays": 666,
707
+ "Ġkne": 667,
708
+ "Ġwho": 668,
709
+ "Ġmany": 669,
710
+ "ited": 670,
711
+ "ust": 671,
712
+ "nder": 672,
713
+ "Ġint": 673,
714
+ "Ġabout": 674,
715
+ "Lily": 675,
716
+ "Ġpret": 676,
717
+ "Ġred": 677,
718
+ "Ġany": 678,
719
+ "Ġdec": 679,
720
+ "ive": 680,
721
+ "ĠD": 681,
722
+ "Ġknew": 682,
723
+ "ace": 683,
724
+ "Ġmore": 684,
725
+ "ous": 685,
726
+ "ise": 686,
727
+ "Ġpic": 687,
728
+ "au": 688,
729
+ "Ġcare": 689,
730
+ "Ġv": 690,
731
+ "Ġlearned": 691,
732
+ "ally": 692,
733
+ "ĠLucy": 693,
734
+ "Ġbecame": 694,
735
+ "qu": 695,
736
+ "Ġwater": 696,
737
+ "Ġhug": 697,
738
+ "fter": 698,
739
+ "Ġbest": 699,
740
+ "Ġpo": 700,
741
+ "ause": 701,
742
+ "Ġgre": 702,
743
+ "Ġop": 703,
744
+ "ways": 704,
745
+ "urp": 705,
746
+ "Ġlaughed": 706,
747
+ "Ġoutside": 707,
748
+ "Ġalways": 708,
749
+ "Ġexc": 709,
750
+ "Ġun": 710,
751
+ "Ġlook": 711,
752
+ "ĠBob": 712,
753
+ "Ġbecause": 713,
754
+ "Ġshow": 714,
755
+ "Ġdecid": 715,
756
+ "Ġroom": 716,
757
+ "ant": 717,
758
+ "ĠSo": 718,
759
+ "Ġeat": 719,
760
+ "fe": 720,
761
+ "Ġho": 721,
762
+ "Ġdecided": 722,
763
+ "Ġinto": 723,
764
+ "Ġjump": 724,
765
+ "ite": 725,
766
+ "ĠAnd": 726,
767
+ "Ġboth": 727,
768
+ "Ġpe": 728,
769
+ "ers": 729,
770
+ "ĠMom": 730,
771
+ "Ġdad": 731,
772
+ "They": 732,
773
+ "Ġke": 733,
774
+ "udd": 734,
775
+ "Ġone": 735,
776
+ "Ġfast": 736,
777
+ "Ġnice": 737,
778
+ "nn": 738,
779
+ "Ġrun": 739,
780
+ "Ġthis": 740,
781
+ "Tom": 741,
782
+ "Yes": 742,
783
+ "Ġlong": 743,
784
+ "Ġfeel": 744,
785
+ "Ġexcited": 745,
786
+ "ĠE": 746,
787
+ "Ġtold": 747,
788
+ "Ġsk": 748,
789
+ "Ġam": 749,
790
+ "urpr": 750,
791
+ "Ġinside": 751,
792
+ "Ġtr": 752,
793
+ "ull": 753,
794
+ "our": 754,
795
+ "Ġsurpr": 755,
796
+ "Ġpretty": 756,
797
+ "Ġmo": 757,
798
+ "iny": 758,
799
+ "ink": 759,
800
+ "Ġsor": 760,
801
+ "Wh": 761,
802
+ "Ġtake": 762,
803
+ "og": 763,
804
+ "Ġgave": 764,
805
+ "lew": 765,
806
+ "Ġrock": 766,
807
+ "Ġsl": 767,
808
+ "Ġeach": 768,
809
+ "Ġmuch": 769,
810
+ "Ġstr": 770,
811
+ "imal": 771,
812
+ "Ġhow": 772,
813
+ "Ġgra": 773,
814
+ "Ġanimal": 774,
815
+ "nna": 775,
816
+ "ara": 776,
817
+ "Ġneed": 777,
818
+ "ged": 778,
819
+ "Ġtow": 779,
820
+ "etter": 780,
821
+ "Ġthan": 781,
822
+ "But": 782,
823
+ "ven": 783,
824
+ "Ġor": 784,
825
+ "Ġunder": 785,
826
+ "ĠC": 786,
827
+ "ess": 787,
828
+ "Ġsorry": 788,
829
+ "Ġold": 789,
830
+ "ised": 790,
831
+ "ge": 791,
832
+ "ro": 792,
833
+ "urt": 793,
834
+ "Ġfish": 794,
835
+ "Ġcle": 795,
836
+ "Ġwalked": 796,
837
+ "Ġbear": 797,
838
+ "and": 798,
839
+ "Ġclo": 799,
840
+ "ase": 800,
841
+ "ast": 801,
842
+ "Ġhand": 802,
843
+ "urn": 803,
844
+ "Ġkind": 804,
845
+ "ĠHis": 805,
846
+ "ĠWe": 806,
847
+ "Ġhappened": 807,
848
+ "Ġflow": 808,
849
+ "Ġfood": 809,
850
+ "here": 810,
851
+ "Ġlist": 811,
852
+ "Ġhig": 812,
853
+ "Ġanimals": 813,
854
+ "Ġjust": 814,
855
+ "Ġte": 815,
856
+ "Ġdidn": 816,
857
+ "Ġnear": 817,
858
+ "Ġide": 818,
859
+ "Ġsky": 819,
860
+ "Ġwat": 820,
861
+ "rom": 821,
862
+ "Ġtry": 822,
863
+ "ine": 823,
864
+ "Ġsn": 824,
865
+ "Ġfi": 825,
866
+ "ched": 826,
867
+ "ĠAmy": 827,
868
+ "ving": 828,
869
+ "Ġbug": 829,
870
+ "Ġidea": 830,
871
+ "Ġbetter": 831,
872
+ "Ġus": 832,
873
+ "pl": 833,
874
+ "gry": 834,
875
+ "Ġits": 835,
876
+ "pec": 836,
877
+ "Ġheard": 837,
878
+ "Ġtw": 838,
879
+ "Ġlet": 839,
880
+ "ff": 840,
881
+ "able": 841,
882
+ "ate": 842,
883
+ "Ġshare": 843,
884
+ "Ġcareful": 844,
885
+ "Thank": 845,
886
+ "Ġen": 846,
887
+ "more": 847,
888
+ "Ġanymore": 848,
889
+ "Ġfly": 849,
890
+ "Ġflew": 850,
891
+ "Ġstor": 851,
892
+ "Ġif": 852,
893
+ "Mom": 853,
894
+ "ial": 854,
895
+ "Ġlots": 855,
896
+ "ĠTh": 856,
897
+ "Ġcom": 857,
898
+ "Ġspec": 858,
899
+ "Ġdan": 859,
900
+ "Ġspecial": 860,
901
+ "ion": 861,
902
+ "Ġby": 862,
903
+ "Ġnever": 863,
904
+ "ream": 864,
905
+ "lf": 865,
906
+ "Ġwind": 866,
907
+ "Ġbu": 867,
908
+ "Ġclean": 868,
909
+ "Ġfo": 869,
910
+ "Ġtal": 870,
911
+ "Ġdon": 871,
912
+ "Ġgr": 872,
913
+ "ort": 873,
914
+ "rm": 874,
915
+ "Ġend": 875,
916
+ "ople": 876,
917
+ "Ġlove": 877,
918
+ "ĠThen": 878,
919
+ "Ġeven": 879,
920
+ "ber": 880,
921
+ "Ġmag": 881,
922
+ "Ġshiny": 882,
923
+ "Ġhard": 883,
924
+ "Ġcake": 884,
925
+ "Ġfore": 885,
926
+ "Ġover": 886,
927
+ "ak": 887,
928
+ "Ġcol": 888,
929
+ "Ġbook": 889,
930
+ "udden": 890,
931
+ "Ġturn": 891,
932
+ "Ġsafe": 892,
933
+ "Ġfam": 893,
934
+ "Ġafter": 894,
935
+ "Ġbad": 895,
936
+ "pected": 896,
937
+ "Ġpeople": 897,
938
+ "Ġsurprised": 898,
939
+ "Ġhigh": 899,
940
+ "Ġproud": 900,
941
+ "ady": 901,
942
+ "ĠAnna": 902,
943
+ "Ġhurt": 903,
944
+ "imb": 904,
945
+ "ĠEvery": 905,
946
+ "Let": 906,
947
+ "expected": 907,
948
+ "Ġunexpected": 908,
949
+ "uddenly": 909,
950
+ "Ġpicked": 910,
951
+ "Ġground": 911,
952
+ "Ġcu": 912,
953
+ "Ġclimb": 913,
954
+ "Ġdoor": 914,
955
+ "Ġcome": 915,
956
+ "arden": 916,
957
+ "Ġgarden": 917,
958
+ "Ġopened": 918,
959
+ "ble": 919,
960
+ "Ġloud": 920,
961
+ "As": 921,
962
+ "Ġgl": 922,
963
+ "Ġche": 923,
964
+ "Ġim": 924,
965
+ "ild": 925,
966
+ "'m": 926,
967
+ "Ġgive": 927,
968
+ "ail": 928,
969
+ "Ġcolor": 929,
970
+ "Ġblue": 930,
971
+ "Ġway": 931,
972
+ "Ġever": 932,
973
+ "Ġthanked": 933,
974
+ "ĠFrom": 934,
975
+ "Ġstill": 935,
976
+ "Ġfar": 936,
977
+ "Ġhugged": 937,
978
+ "ĠHer": 938,
979
+ "ip": 939,
980
+ "ĠWhen": 940,
981
+ "Ġcall": 941,
982
+ "No": 942,
983
+ "Ġmagic": 943,
984
+ "ĠSara": 944,
985
+ "ummy": 945,
986
+ "ĠK": 946,
987
+ "age": 947,
988
+ "Ġoff": 948,
989
+ "iz": 949,
990
+ "Ġjumped": 950,
991
+ "ough": 951,
992
+ "Ġpar": 952,
993
+ "Ġfamily": 953,
994
+ "Ġshould": 954,
995
+ "Ġkid": 955,
996
+ "ool": 956,
997
+ "uff": 957,
998
+ "Ġsmile": 958,
999
+ "hes": 959,
1000
+ "Ġplace": 960,
1001
+ "ĠIn": 961,
1002
+ "kay": 962,
1003
+ "Ġwalk": 963,
1004
+ "Ġgreat": 964,
1005
+ "Ġnow": 965,
1006
+ "Ġstrong": 966,
1007
+ "ct": 967,
1008
+ "em": 968,
1009
+ "Ġstay": 969,
1010
+ "itty": 970,
1011
+ "ture": 971,
1012
+ "Ġqu": 972,
1013
+ "Ġforest": 973,
1014
+ "Ġunt": 974,
1015
+ "Ġsto": 975,
1016
+ "aut": 976,
1017
+ "ane": 977,
1018
+ "Ġbro": 978,
1019
+ "Ġbra": 979,
1020
+ "oon": 980,
1021
+ "Ġsqu": 981,
1022
+ "ĠP": 982,
1023
+ "Ġboat": 983,
1024
+ "Ġstick": 984,
1025
+ "Ġuntil": 985,
1026
+ "Ġfrog": 986,
1027
+ "Ġbeaut": 987,
1028
+ "dy": 988,
1029
+ "Ġnext": 989,
1030
+ "lease": 990,
1031
+ "Ġhappily": 991,
1032
+ "ning": 992,
1033
+ "Ġlisten": 993,
1034
+ "Ġkids": 994,
1035
+ "Ġtra": 995,
1036
+ "Ġhelped": 996,
1037
+ "aking": 997,
1038
+ "Ġapp": 998,
1039
+ "iful": 999,
1040
+ "Ġbeautiful": 1000,
1041
+ "Ġshowed": 1001,
1042
+ "th": 1002,
1043
+ "ies": 1003,
1044
+ "Ġdra": 1004,
1045
+ "Ġstory": 1005,
1046
+ "unny": 1006,
1047
+ "Ġtown": 1007,
1048
+ "by": 1008,
1049
+ "Ġimp": 1009,
1050
+ "rel": 1010,
1051
+ "Ġwhile": 1011,
1052
+ "Ġclos": 1012,
1053
+ "be": 1013,
1054
+ "oy": 1014,
1055
+ "Ġrain": 1015,
1056
+ "Ġpicture": 1016,
1057
+ "ress": 1017,
1058
+ "pt": 1018,
1059
+ "Ġbeing": 1019,
1060
+ "Ġeveryone": 1020,
1061
+ "Ġrem": 1021,
1062
+ "Ġhat": 1022,
1063
+ "Ġmor": 1023
1064
+ },
1065
+ "merges": [
1066
+ "Ġ t",
1067
+ "h e",
1068
+ "Ġ a",
1069
+ "Ġ s",
1070
+ "n d",
1071
+ "Ġ w",
1072
+ "Ġt he",
1073
+ "e d",
1074
+ "Ġ b",
1075
+ "Ġt o",
1076
+ "Ġa nd",
1077
+ "Ġ h",
1078
+ "Ġ f",
1079
+ "Ġ T",
1080
+ "i n",
1081
+ "Ġw a",
1082
+ "r e",
1083
+ "i t",
1084
+ "o u",
1085
+ "Ġ l",
1086
+ "Ġ d",
1087
+ "Ġ c",
1088
+ "Ġ p",
1089
+ "a y",
1090
+ "Ġ m",
1091
+ "e r",
1092
+ "Ġwa s",
1093
+ "ĠT he",
1094
+ "o m",
1095
+ "Ġ he",
1096
+ "i s",
1097
+ "Ġ n",
1098
+ "a r",
1099
+ "i m",
1100
+ "o n",
1101
+ "Ġs a",
1102
+ "i d",
1103
+ "l l",
1104
+ "Ġh a",
1105
+ "Ġ g",
1106
+ "a t",
1107
+ "Ġ S",
1108
+ "in g",
1109
+ "o t",
1110
+ "e n",
1111
+ "a n",
1112
+ "l e",
1113
+ "o r",
1114
+ "e nd",
1115
+ "i r",
1116
+ "o f",
1117
+ "a m",
1118
+ "e t",
1119
+ "Ġ H",
1120
+ "Ġ it",
1121
+ "Ġt h",
1122
+ "i g",
1123
+ "ĠThe y",
1124
+ "Ġ in",
1125
+ "i l",
1126
+ "Ġp l",
1127
+ "Ġ \"",
1128
+ "ĠH e",
1129
+ "o w",
1130
+ "r i",
1131
+ "v er",
1132
+ "u t",
1133
+ "Ġ u",
1134
+ "Ġb e",
1135
+ "Ġpl ay",
1136
+ "Ġsa id",
1137
+ "it h",
1138
+ "Ġd ay",
1139
+ "Ġw ith",
1140
+ "p p",
1141
+ "O n",
1142
+ "Ġ y",
1143
+ "o o",
1144
+ "k ed",
1145
+ "Ġ r",
1146
+ "e x",
1147
+ "Ġhe r",
1148
+ "c e",
1149
+ "Ġ I",
1150
+ "ĠT im",
1151
+ "ĠS he",
1152
+ "l d",
1153
+ "Ġh is",
1154
+ "Ġs t",
1155
+ "k e",
1156
+ "Ġb ig",
1157
+ "n t",
1158
+ "c k",
1159
+ "ver y",
1160
+ "Ġy ou",
1161
+ "s t",
1162
+ "v e",
1163
+ "Ġha pp",
1164
+ "u n",
1165
+ "Ġ on",
1166
+ "ri end",
1167
+ "Ġf riend",
1168
+ "a ll",
1169
+ "il y",
1170
+ "ex t",
1171
+ "Ġ L",
1172
+ "Ġthe y",
1173
+ "of t",
1174
+ "Ġw e",
1175
+ "Ġha d",
1176
+ "Ġn ot",
1177
+ "Ġl i",
1178
+ "Ġu p",
1179
+ "he r",
1180
+ "Ġwa nt",
1181
+ "Ġ of",
1182
+ "it t",
1183
+ "< |",
1184
+ "| >",
1185
+ "end oft",
1186
+ "endoft ext",
1187
+ "a d",
1188
+ "s e",
1189
+ "Ġ B",
1190
+ "Ġd o",
1191
+ "Ġ e",
1192
+ "Ġhapp y",
1193
+ "Ġ very",
1194
+ "en t",
1195
+ "Ġ M",
1196
+ "' s",
1197
+ "e s",
1198
+ "Ġsa w",
1199
+ "On e",
1200
+ "Ġth at",
1201
+ "ou ld",
1202
+ "Ġm om",
1203
+ "Ġf or",
1204
+ "itt le",
1205
+ "Ġs h",
1206
+ "Ġl ittle",
1207
+ "Ġs he",
1208
+ ". \"",
1209
+ "im e",
1210
+ "Ġn am",
1211
+ "c h",
1212
+ "Ġt ime",
1213
+ "Ġ k",
1214
+ "ou nd",
1215
+ "Ġs o",
1216
+ "Ġthe re",
1217
+ "Ġnam ed",
1218
+ "Ġb o",
1219
+ "Ġs m",
1220
+ "ĠL ily",
1221
+ "Ġwe re",
1222
+ "Ġwant ed",
1223
+ "Ġn e",
1224
+ "! \"",
1225
+ "Ġb ut",
1226
+ "Ġfriend s",
1227
+ "ou t",
1228
+ "v ed",
1229
+ "ĠT om",
1230
+ "T he",
1231
+ "ir d",
1232
+ "h t",
1233
+ "e l",
1234
+ "Ġb ird",
1235
+ "Ġa n",
1236
+ "a l",
1237
+ "a ke",
1238
+ "Ġto o",
1239
+ "ĠI t",
1240
+ "u g",
1241
+ "om e",
1242
+ "Ġw ent",
1243
+ "id e",
1244
+ "Ġhe l",
1245
+ "On ce",
1246
+ "Ġw h",
1247
+ "Ġa ll",
1248
+ "Ġ is",
1249
+ "Ġhel p",
1250
+ "u e",
1251
+ "Ġl o",
1252
+ "Ġl oo",
1253
+ "t er",
1254
+ "Ġup on",
1255
+ "Ġ A",
1256
+ "r y",
1257
+ "o re",
1258
+ "Ġf un",
1259
+ "i nd",
1260
+ "Ġto y",
1261
+ "g et",
1262
+ "am e",
1263
+ "i ll",
1264
+ "Ġa s",
1265
+ "Ġa t",
1266
+ "r a",
1267
+ "Ġd id",
1268
+ "Ġ j",
1269
+ "get her",
1270
+ "Ġ re",
1271
+ "u r",
1272
+ "Ġ o",
1273
+ "Ġto gether",
1274
+ "Ġs e",
1275
+ "Ġc at",
1276
+ "a ck",
1277
+ "Ġt re",
1278
+ "l y",
1279
+ "oo d",
1280
+ "i c",
1281
+ "Ġdo g",
1282
+ "t ed",
1283
+ "Ġc ould",
1284
+ "Ġc an",
1285
+ "Ġthe ir",
1286
+ "ar d",
1287
+ "ar k",
1288
+ "? \"",
1289
+ "e c",
1290
+ "Ġplay ed",
1291
+ "Ġb all",
1292
+ "Ġg ir",
1293
+ "Ġr o",
1294
+ "Ġh im",
1295
+ "Ġgir l",
1296
+ "w ay",
1297
+ "he d",
1298
+ "Ġg o",
1299
+ "m y",
1300
+ "Ġl e",
1301
+ "Ġa re",
1302
+ "' t",
1303
+ "Ġ out",
1304
+ "Ġf r",
1305
+ "a in",
1306
+ "Ġk n",
1307
+ "he n",
1308
+ "Ġthe m",
1309
+ "u m",
1310
+ "a x",
1311
+ "Ġsa d",
1312
+ "Ġbo y",
1313
+ "Ġtre e",
1314
+ "u l",
1315
+ "ot her",
1316
+ "Ġm an",
1317
+ "Ġha ve",
1318
+ "Ġlo ved",
1319
+ "Ġc l",
1320
+ "Ġf ound",
1321
+ "Ġloo ked",
1322
+ "ou g",
1323
+ "ĠS ue",
1324
+ "Ġs p",
1325
+ "Ġst ar",
1326
+ "on e",
1327
+ "Ġs c",
1328
+ "h ing",
1329
+ "Ġb ack",
1330
+ "ĠM ax",
1331
+ "ow n",
1332
+ "a re",
1333
+ "Ġli ke",
1334
+ "Ġbe c",
1335
+ "s ide",
1336
+ "f ul",
1337
+ "Ġm e",
1338
+ "Ġp ark",
1339
+ "on g",
1340
+ "Ġc ar",
1341
+ "ig ht",
1342
+ "o p",
1343
+ "Ġ One",
1344
+ "el t",
1345
+ "Ġli ked",
1346
+ "Ġw ould",
1347
+ "Ġl a",
1348
+ "Ġm ake",
1349
+ "Ġf a",
1350
+ "Ġf elt",
1351
+ "r ound",
1352
+ "Y ou",
1353
+ "e ll",
1354
+ "Ġ W",
1355
+ "Ġse e",
1356
+ "ĠB ut",
1357
+ "om et",
1358
+ "Ġas ked",
1359
+ "Ġne w",
1360
+ "a g",
1361
+ "ĠS am",
1362
+ "ou se",
1363
+ "Ġc ame",
1364
+ "ar ed",
1365
+ "Ġstar ted",
1366
+ "Ġn o",
1367
+ "i ce",
1368
+ "ĠB en",
1369
+ "oug ht",
1370
+ "Ġ other",
1371
+ "Ġa l",
1372
+ "il ed",
1373
+ "Ġa g",
1374
+ "Ġsm all",
1375
+ "Ġg ood",
1376
+ "Ġs omet",
1377
+ "Ġb r",
1378
+ "s s",
1379
+ "ri ed",
1380
+ "ad e",
1381
+ "Ġsm iled",
1382
+ "ing s",
1383
+ "Ġs ay",
1384
+ "o b",
1385
+ "p ot",
1386
+ "Ġf ind",
1387
+ "Ġw or",
1388
+ "i a",
1389
+ "t y",
1390
+ "Ġa way",
1391
+ "Ġp ut",
1392
+ "Ġm ade",
1393
+ "Ġth ought",
1394
+ "en ed",
1395
+ "Ġfr om",
1396
+ "Ġwh at",
1397
+ "Ġh ome",
1398
+ "Ġsomet hing",
1399
+ "Ġplay ing",
1400
+ "Ġ ex",
1401
+ "Ġc o",
1402
+ "Ġe very",
1403
+ "oo k",
1404
+ "Ġwa l",
1405
+ "u c",
1406
+ "Ġm u",
1407
+ "a ch",
1408
+ "ĠS pot",
1409
+ "ar n",
1410
+ "Ġ F",
1411
+ "Ġr an",
1412
+ "i le",
1413
+ "i e",
1414
+ "a ve",
1415
+ "Ġag ain",
1416
+ "Ġla ug",
1417
+ "Ġh ouse",
1418
+ "Ġs ome",
1419
+ "Ġ J",
1420
+ "Ġd own",
1421
+ "Ġf l",
1422
+ "d d",
1423
+ "Ġtoo k",
1424
+ "Ġsc ared",
1425
+ "Ġtoy s",
1426
+ "k ing",
1427
+ "Ġle arn",
1428
+ "n y",
1429
+ "Ġp r",
1430
+ "Ġbo x",
1431
+ "u re",
1432
+ "Ġw ill",
1433
+ "i f",
1434
+ "re t",
1435
+ "Ġ You",
1436
+ "a b",
1437
+ "i ck",
1438
+ "e p",
1439
+ "Ġth ings",
1440
+ "Ġm y",
1441
+ "Ġyou r",
1442
+ "Ġa round",
1443
+ "Ġb l",
1444
+ "Ġli ved",
1445
+ "ou d",
1446
+ "is h",
1447
+ "u ck",
1448
+ "Ġw hen",
1449
+ "Ġs un",
1450
+ ", \"",
1451
+ "Ġf e",
1452
+ "Ġthe n",
1453
+ "a s",
1454
+ "Ġs w",
1455
+ "Ġc h",
1456
+ "u s",
1457
+ "pp ed",
1458
+ "ĠM ia",
1459
+ "Ġa b",
1460
+ "an k",
1461
+ "T im",
1462
+ "uc y",
1463
+ "um p",
1464
+ "Ġg et",
1465
+ "Ġl ot",
1466
+ "T h",
1467
+ "is t",
1468
+ "ot h",
1469
+ "Ġt ried",
1470
+ "a p",
1471
+ "Ġkn ow",
1472
+ "Ġg ot",
1473
+ "Ġsay s",
1474
+ "Ġkn e",
1475
+ "Ġwh o",
1476
+ "Ġman y",
1477
+ "it ed",
1478
+ "u st",
1479
+ "nd er",
1480
+ "Ġin t",
1481
+ "Ġab out",
1482
+ "L ily",
1483
+ "Ġp ret",
1484
+ "Ġr ed",
1485
+ "Ġan y",
1486
+ "Ġd ec",
1487
+ "i ve",
1488
+ "Ġ D",
1489
+ "Ġkne w",
1490
+ "a ce",
1491
+ "Ġm ore",
1492
+ "ou s",
1493
+ "is e",
1494
+ "Ġp ic",
1495
+ "a u",
1496
+ "Ġc are",
1497
+ "Ġ v",
1498
+ "Ġlearn ed",
1499
+ "all y",
1500
+ "ĠL ucy",
1501
+ "Ġbec ame",
1502
+ "q u",
1503
+ "Ġwa ter",
1504
+ "Ġh ug",
1505
+ "f ter",
1506
+ "Ġbe st",
1507
+ "Ġp o",
1508
+ "au se",
1509
+ "Ġg re",
1510
+ "Ġo p",
1511
+ "way s",
1512
+ "ur p",
1513
+ "Ġlaug hed",
1514
+ "Ġout side",
1515
+ "Ġal ways",
1516
+ "Ġex c",
1517
+ "Ġu n",
1518
+ "Ġloo k",
1519
+ "ĠB ob",
1520
+ "Ġbec ause",
1521
+ "Ġsh ow",
1522
+ "Ġdec id",
1523
+ "Ġro om",
1524
+ "an t",
1525
+ "ĠS o",
1526
+ "Ġe at",
1527
+ "f e",
1528
+ "Ġh o",
1529
+ "Ġdecid ed",
1530
+ "Ġint o",
1531
+ "Ġj ump",
1532
+ "it e",
1533
+ "ĠA nd",
1534
+ "Ġb oth",
1535
+ "Ġp e",
1536
+ "er s",
1537
+ "ĠM om",
1538
+ "Ġd ad",
1539
+ "The y",
1540
+ "Ġ ke",
1541
+ "u dd",
1542
+ "Ġon e",
1543
+ "Ġfa st",
1544
+ "Ġn ice",
1545
+ "n n",
1546
+ "Ġr un",
1547
+ "Ġth is",
1548
+ "T om",
1549
+ "Y es",
1550
+ "Ġl ong",
1551
+ "Ġfe el",
1552
+ "Ġexc ited",
1553
+ "Ġ E",
1554
+ "Ġto ld",
1555
+ "Ġs k",
1556
+ "Ġa m",
1557
+ "urp r",
1558
+ "Ġin side",
1559
+ "Ġt r",
1560
+ "u ll",
1561
+ "ou r",
1562
+ "Ġs urpr",
1563
+ "Ġpret ty",
1564
+ "Ġm o",
1565
+ "in y",
1566
+ "in k",
1567
+ "Ġs or",
1568
+ "W h",
1569
+ "Ġt ake",
1570
+ "o g",
1571
+ "Ġg ave",
1572
+ "le w",
1573
+ "Ġro ck",
1574
+ "Ġs l",
1575
+ "Ġe ach",
1576
+ "Ġmu ch",
1577
+ "Ġst r",
1578
+ "im al",
1579
+ "Ġh ow",
1580
+ "Ġg ra",
1581
+ "Ġan imal",
1582
+ "nn a",
1583
+ "ar a",
1584
+ "Ġne ed",
1585
+ "g ed",
1586
+ "Ġto w",
1587
+ "et ter",
1588
+ "Ġth an",
1589
+ "B ut",
1590
+ "v en",
1591
+ "Ġ or",
1592
+ "Ġu nder",
1593
+ "Ġ C",
1594
+ "es s",
1595
+ "Ġsor ry",
1596
+ "Ġo ld",
1597
+ "is ed",
1598
+ "g e",
1599
+ "r o",
1600
+ "ur t",
1601
+ "Ġf ish",
1602
+ "Ġc le",
1603
+ "Ġwal ked",
1604
+ "Ġbe ar",
1605
+ "a nd",
1606
+ "Ġcl o",
1607
+ "a se",
1608
+ "a st",
1609
+ "Ġha nd",
1610
+ "ur n",
1611
+ "Ġk ind",
1612
+ "ĠH is",
1613
+ "ĠW e",
1614
+ "Ġhapp ened",
1615
+ "Ġfl ow",
1616
+ "Ġf ood",
1617
+ "he re",
1618
+ "Ġl ist",
1619
+ "Ġh ig",
1620
+ "Ġanimal s",
1621
+ "Ġj ust",
1622
+ "Ġt e",
1623
+ "Ġdid n",
1624
+ "Ġne ar",
1625
+ "Ġ ide",
1626
+ "Ġsk y",
1627
+ "Ġwa t",
1628
+ "r om",
1629
+ "Ġt ry",
1630
+ "in e",
1631
+ "Ġs n",
1632
+ "Ġf i",
1633
+ "c hed",
1634
+ "ĠA my",
1635
+ "v ing",
1636
+ "Ġb ug",
1637
+ "Ġide a",
1638
+ "Ġb etter",
1639
+ "Ġu s",
1640
+ "p l",
1641
+ "g ry",
1642
+ "Ġit s",
1643
+ "p ec",
1644
+ "Ġhe ard",
1645
+ "Ġt w",
1646
+ "Ġl et",
1647
+ "f f",
1648
+ "ab le",
1649
+ "at e",
1650
+ "Ġsh are",
1651
+ "Ġcare ful",
1652
+ "Th ank",
1653
+ "Ġ en",
1654
+ "m ore",
1655
+ "Ġany more",
1656
+ "Ġf ly",
1657
+ "Ġf lew",
1658
+ "Ġst or",
1659
+ "Ġ if",
1660
+ "M om",
1661
+ "i al",
1662
+ "Ġlot s",
1663
+ "ĠT h",
1664
+ "Ġc om",
1665
+ "Ġsp ec",
1666
+ "Ġd an",
1667
+ "Ġspec ial",
1668
+ "i on",
1669
+ "Ġb y",
1670
+ "Ġne ver",
1671
+ "re am",
1672
+ "l f",
1673
+ "Ġw ind",
1674
+ "Ġb u",
1675
+ "Ġcle an",
1676
+ "Ġf o",
1677
+ "Ġt al",
1678
+ "Ġd on",
1679
+ "Ġg r",
1680
+ "or t",
1681
+ "r m",
1682
+ "Ġ end",
1683
+ "op le",
1684
+ "Ġlo ve",
1685
+ "ĠThe n",
1686
+ "Ġe ven",
1687
+ "b er",
1688
+ "Ġm ag",
1689
+ "Ġsh iny",
1690
+ "Ġh ard",
1691
+ "Ġc ake",
1692
+ "Ġf ore",
1693
+ "Ġo ver",
1694
+ "a k",
1695
+ "Ġco l",
1696
+ "Ġb ook",
1697
+ "udd en",
1698
+ "Ġt urn",
1699
+ "Ġsa fe",
1700
+ "Ġf am",
1701
+ "Ġa fter",
1702
+ "Ġb ad",
1703
+ "pec ted",
1704
+ "Ġpe ople",
1705
+ "Ġsurpr ised",
1706
+ "Ġhig h",
1707
+ "Ġpr oud",
1708
+ "ad y",
1709
+ "ĠA nna",
1710
+ "Ġh urt",
1711
+ "im b",
1712
+ "ĠE very",
1713
+ "L et",
1714
+ "ex pected",
1715
+ "Ġun expected",
1716
+ "udden ly",
1717
+ "Ġpic ked",
1718
+ "Ġg round",
1719
+ "Ġc u",
1720
+ "Ġcl imb",
1721
+ "Ġdo or",
1722
+ "Ġc ome",
1723
+ "ard en",
1724
+ "Ġg arden",
1725
+ "Ġop ened",
1726
+ "b le",
1727
+ "Ġl oud",
1728
+ "A s",
1729
+ "Ġg l",
1730
+ "Ġc he",
1731
+ "Ġ im",
1732
+ "il d",
1733
+ "' m",
1734
+ "Ġg ive",
1735
+ "a il",
1736
+ "Ġcol or",
1737
+ "Ġbl ue",
1738
+ "Ġwa y",
1739
+ "Ġe ver",
1740
+ "Ġthan ked",
1741
+ "ĠF rom",
1742
+ "Ġst ill",
1743
+ "Ġf ar",
1744
+ "Ġhug ged",
1745
+ "ĠH er",
1746
+ "i p",
1747
+ "ĠW hen",
1748
+ "Ġc all",
1749
+ "N o",
1750
+ "Ġmag ic",
1751
+ "ĠS ara",
1752
+ "um my",
1753
+ "Ġ K",
1754
+ "ag e",
1755
+ "Ġof f",
1756
+ "i z",
1757
+ "Ġjump ed",
1758
+ "oug h",
1759
+ "Ġp ar",
1760
+ "Ġfam ily",
1761
+ "Ġsh ould",
1762
+ "Ġk id",
1763
+ "oo l",
1764
+ "u ff",
1765
+ "Ġsm ile",
1766
+ "he s",
1767
+ "Ġpl ace",
1768
+ "ĠI n",
1769
+ "k ay",
1770
+ "Ġwal k",
1771
+ "Ġgre at",
1772
+ "Ġn ow",
1773
+ "Ġstr ong",
1774
+ "c t",
1775
+ "e m",
1776
+ "Ġst ay",
1777
+ "itt y",
1778
+ "t ure",
1779
+ "Ġ qu",
1780
+ "Ġfore st",
1781
+ "Ġu nt",
1782
+ "Ġst o",
1783
+ "a ut",
1784
+ "an e",
1785
+ "Ġbr o",
1786
+ "Ġb ra",
1787
+ "o on",
1788
+ "Ġs qu",
1789
+ "Ġ P",
1790
+ "Ġbo at",
1791
+ "Ġst ick",
1792
+ "Ġunt il",
1793
+ "Ġfr og",
1794
+ "Ġbe aut",
1795
+ "d y",
1796
+ "Ġn ext",
1797
+ "le ase",
1798
+ "Ġhapp ily",
1799
+ "n ing",
1800
+ "Ġlist en",
1801
+ "Ġkid s",
1802
+ "Ġt ra",
1803
+ "Ġhelp ed",
1804
+ "a king",
1805
+ "Ġa pp",
1806
+ "i ful",
1807
+ "Ġbeaut iful",
1808
+ "Ġshow ed",
1809
+ "t h",
1810
+ "i es",
1811
+ "Ġd ra",
1812
+ "Ġstor y",
1813
+ "un ny",
1814
+ "Ġtow n",
1815
+ "b y",
1816
+ "Ġim p",
1817
+ "re l",
1818
+ "Ġwh ile",
1819
+ "Ġclo s",
1820
+ "b e",
1821
+ "o y",
1822
+ "Ġr ain",
1823
+ "Ġpic ture",
1824
+ "re ss",
1825
+ "p t",
1826
+ "Ġbe ing",
1827
+ "Ġevery one",
1828
+ "Ġre m",
1829
+ "Ġha t",
1830
+ "Ġm or"
1831
+ ]
1832
+ }
1833
+ }
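The `merges` list above is ordered: in byte-pair encoding, an earlier entry takes priority over a later one. Below is a minimal sketch of how such a list is typically applied to one pre-tokenized word. It assumes the standard BPE procedure and is not taken from this repository. The helper name `apply_merges` is hypothetical. Only three entries from the list are used (`"Ġ t"`, `"h e"`, `"Ġt he"`).

```python
# Hypothetical helper, not part of this commit: a simplified BPE merge loop.
def apply_merges(word, merges_raw):
    """Apply BPE merges to `word`, preferring pairs that appear earlier in the list."""
    # Each entry in tokenizer.json is "left right"; its position gives its priority.
    ranks = {tuple(m.split(" ")): i for i, m in enumerate(merges_raw)}
    symbols = list(word)  # start from single characters
    while len(symbols) > 1:
        # Rank every adjacent pair; pairs not in the merge list get infinity.
        candidates = [
            (ranks.get((a, b), float("inf")), i)
            for i, (a, b) in enumerate(zip(symbols, symbols[1:]))
        ]
        best_rank, idx = min(candidates)
        if best_rank == float("inf"):
            break  # no applicable merge remains
        symbols[idx:idx + 2] = [symbols[idx] + symbols[idx + 1]]
    return symbols


# Three merges copied from the list above ("Ġ" marks a leading space).
merges_raw = ["Ġ t", "h e", "Ġt he"]
result = apply_merges("Ġthe", merges_raw)
unmerged = apply_merges("xyz", merges_raw)
```

With these three entries, `Ġthe` collapses step by step into a single symbol. A word with no matching pairs stays split into characters. A production tokenizer also handles byte-level fallback and caching, which this sketch leaves out.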
tokenizer_config.json ADDED
@@ -0,0 +1,18 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "<|endoftext|>",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ }
11
+ },
12
+ "bos_token": "<|endoftext|>",
13
+ "clean_up_tokenization_spaces": true,
14
+ "eos_token": "<|endoftext|>",
15
+ "model_max_length": 1000000000000000019884624838656,
16
+ "tokenizer_class": "PreTrainedTokenizerFast",
17
+ "unk_token": "<|endoftext|>"
18
+ }
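Two details of this config are easy to miss. The `model_max_length` value equals `int(1e30)`, which to my knowledge is the placeholder `transformers` writes when no real limit was set. Also, `bos_token`, `eos_token`, and `unk_token` all map to the same string. The sketch below checks both with the standard library only. The names `SENTINEL`, `has_real_limit`, and `shared_tokens` are illustrative, and the JSON is a trimmed copy of the fields shown above.

```python
import json

# Trimmed copy of the tokenizer_config.json fields shown in the diff above.
config = json.loads("""
{
  "bos_token": "<|endoftext|>",
  "clean_up_tokenization_spaces": true,
  "eos_token": "<|endoftext|>",
  "model_max_length": 1000000000000000019884624838656,
  "tokenizer_class": "PreTrainedTokenizerFast",
  "unk_token": "<|endoftext|>"
}
""")

# Assumption: int(1e30) is the "no limit configured" placeholder value.
SENTINEL = int(1e30)
has_real_limit = config["model_max_length"] != SENTINEL

# Collect the distinct strings used for the three special-token roles.
shared_tokens = {config[key] for key in ("bos_token", "eos_token", "unk_token")}

print("real max length set:", has_real_limit)
print("distinct special tokens:", shared_tokens)
```

If the placeholder is present, a caller who relies on truncation would probably want to pass an explicit maximum length that matches the model's context size, since the tokenizer config alone does not supply one.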