---
license: cc-by-sa-3.0
language:
- en
pipeline_tag: text-generation
tags:
- csharp
- mpt
- instruct
- 1b
- llm
- .net
---
Upsides:
- similar in quality (slightly worse) to the 7b [Nethermind/Mpt-Instruct-DotNet-S](https://huggingface.co/Nethermind/Mpt-Instruct-DotNet-S) for C# code generation and explanation,
- 1b parameter size (2.6 GB, bfloat16-finetuned),
- 6x smaller,
- 4x+ faster

Downsides:
- Sometimes suffers from repetitive, non-terminating responses to general discussion questions
- Slightly worse at code generation than the 7b model
- No GGML/llama.cpp CPU inference support yet
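The repetition issue can often be tamed in post-processing. Below is a minimal, hypothetical helper (not part of this model's code) that trims a response once it ends in an immediately repeated chunk of text:

```python
def truncate_repetition(text: str, min_chunk: int = 8) -> str:
    """Trim a response that ends in an immediately repeated chunk,
    keeping a single copy of the repeated part."""
    for size in range(len(text) // 2, min_chunk - 1, -1):
        # Repeatedly drop the tail while it duplicates the chunk just before it
        while len(text) >= 2 * size and text[-size:] == text[-2 * size:-size]:
            text = text[:-size]
    return text

# A looping ending is collapsed to a single copy of the repeated sentence
looped = "The code compiles. " * 5
print(truncate_repetition(looped))
```

Tuning `min_chunk` trades off false positives (legitimate short repeats) against missed loops; alternatively, passing `repetition_penalty` to `model.generate` attacks the problem at decoding time.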
Based on [mosaicml/mpt-1b-redpajama-200b-dolly](https://huggingface.co/mosaicml/mpt-1b-redpajama-200b-dolly)

Same data sources as in [Nethermind/Mpt-Instruct-DotNet-S](https://huggingface.co/Nethermind/Mpt-Instruct-DotNet-S)

Usage example:
```python
import torch
import transformers
from transformers import AutoTokenizer

out_name = "Nethermind/Mpt-Instruct-DotNet-XS"
model = transformers.AutoModelForCausalLM.from_pretrained(
    out_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model.to('cuda:0')
model.eval()

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
tokenizer.pad_token = tokenizer.eos_token

INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"
PROMPT_FOR_GENERATION_FORMAT = """{system}
{instruction_key}
{instruction}
{response_key}
""".format(
    system="{system}",
    instruction_key=INSTRUCTION_KEY,
    instruction="{instruction}",
    response_key=RESPONSE_KEY,
)


def output_loop(input_tokens):
    # Greedy decoding, capped so the prompt plus new tokens
    # stay within the model's context window
    return model.generate(
        input_tokens.to('cuda:0'),
        max_new_tokens=min(512, 1024 - input_tokens.shape[1]),
        do_sample=False,
    )


def give_answer(instruction="Create a loop over [0, 6, 7, 77] that prints its contents",
                system="Below is an instruction that describes a task. Write a response that appropriately completes the request."):
    question = PROMPT_FOR_GENERATION_FORMAT.format(system=system, instruction=instruction)
    tokenized_question = tokenizer.encode(question, return_tensors='pt')
    outputs = output_loop(tokenized_question)
    answer = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    print(answer)
    return answer


give_answer("What is the main difference between a struct and a class in C#?")
```

outputs:
```
A struct is a value type, which means it can only hold a few values. It is often used as a placeholder for other data types. A class, on the other hand, is a reference type, which means it can hold references to other data types.
```

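The Dolly-style prompt template used in the snippet can be sanity-checked without loading the model or a GPU; this standalone fragment just reproduces the two-stage string formatting:

```python
INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"

# First pass bakes in the section markers while leaving {system} and
# {instruction} as placeholders for the second pass.
PROMPT_FOR_GENERATION_FORMAT = """{system}
{instruction_key}
{instruction}
{response_key}
""".format(
    system="{system}",
    instruction_key=INSTRUCTION_KEY,
    instruction="{instruction}",
    response_key=RESPONSE_KEY,
)

prompt = PROMPT_FOR_GENERATION_FORMAT.format(
    system="Below is an instruction that describes a task.",
    instruction="What is the main difference between a struct and a class in C#?",
)
print(prompt)
```

The resulting prompt ends with the bare `### Response:` marker, which is what cues the model to start answering.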
On an RTX 4090, generation time by number of new tokens:
- 2 sec for 128 tokens
- 5 sec for 256 tokens
- 11 sec for 512 tokens

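Those timings work out to roughly 45-65 new tokens per second, with throughput dropping as the sequence grows; a quick back-of-the-envelope conversion:

```python
# (new tokens, seconds) pairs from the RTX 4090 timings above
timings = [(128, 2), (256, 5), (512, 11)]

for tokens, seconds in timings:
    # Per-token cost rises with sequence length, so tokens/sec falls
    print(f"{tokens} tokens: {tokens / seconds:.0f} tokens/sec")
```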
Code generation:
prompt:
> Generate code to answer the question.
>
> How would you retrieve and analyse the fee history for the last 100 blocks and determine the average gas price?

Example of code output:
```csharp
public async Task<decimal> GetFeeHistoryGasPriceAverage()
{
    // Get the fee history
    ResultWrapper<FeeHistoryResults> result = await _ethRpc.eth_feeHistory(100, BlockParameter.Latest,
        new double[] { 50, 75, 90 });
    // Check if the API call succeeded
    if (result.Result != Result.Success)
    {
        throw new Exception("Failed to retrieve fee history");
    }
    // Get the gas price average
    decimal averageGasPrice = result.Data.BaseFeePerGas.Average();

    return averageGasPrice;
}
```