Commit 81179f4 by andrijdavid (parent: ecc8bf9): Update README.md

README.md CHANGED
@@ -29,12 +29,14 @@ Solidity Llama 3 is a Large Language Model specifically designed for Solidity co
 ### Direct Use
 
 Solidity Llama 3 can be used for code completion and infilling tasks within Solidity code editors. It was trained for this task using the fill-in-the-middle (FIM) objective, where you provide a prefix and a suffix as context for the completion. The following tokens are used to separate the different parts of the input:
-
-
-
+- <|reserved_special_token_11|> precedes the context before the completion we want to run.
+- <|reserved_special_token_10|> precedes the suffix. You must put this token exactly where the cursor would be positioned in an editor, as this is the location that will be completed by the model.
+- <|reserved_special_token_12|> is the prompt that invites the model to run the generation.
 
 
 ```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
 FIM_SUFFIX = "<|reserved_special_token_10|>"
 FIM_PREFIX = "<|reserved_special_token_11|>"
 FIM_MIDDLE = "<|reserved_special_token_12|>"
@@ -67,6 +69,22 @@ print(tokenizer.decode(outputs[0][prompt_len:]))
 
 ```
 
+You can provide a list of terminators to the generate function, like this:
+
+```python
+
+terminators = tokenizer.convert_tokens_to_ids([FIM_PREFIX, FIM_MIDDLE, FIM_SUFFIX])
+terminators += [tokenizer.eos_token_id]
+
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=1024,
+    eos_token_id=terminators,
+)
+print(tokenizer.decode(outputs[0][prompt_len:]))
+```
+
 ### Out-of-Scope Use
 
 The model may not perform well for tasks outside of Solidity code completion and infilling, and users should be aware of its limitations in these areas.
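For reference, the FIM layout that the changed section describes (prefix token, prefix, suffix token at the cursor, suffix, middle token at the end) can be sketched as plain string assembly. The sentinel strings below are the ones defined in the README; the `build_fim_prompt` helper and the Solidity snippet are hypothetical illustrations, not part of the model card:

```python
# FIM sentinel tokens as defined in the README
FIM_SUFFIX = "<|reserved_special_token_10|>"
FIM_PREFIX = "<|reserved_special_token_11|>"
FIM_MIDDLE = "<|reserved_special_token_12|>"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a fill-in-the-middle prompt: the prefix context comes
    first, the suffix token marks where the editor cursor sits, the
    suffix follows, and the middle token at the end invites the model
    to generate the missing span."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"

# Hypothetical Solidity snippet split at the editor cursor
prefix = (
    "contract Counter {\n"
    "    uint256 public count;\n"
    "    function increment() public {\n"
)
suffix = "\n    }\n}"
prompt = build_fim_prompt(prefix, suffix)
```

Under this layout, the text the model generates is the code that belongs at the cursor position, between the given prefix and suffix.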