---
license: cc-by-sa-3.0
language:
- en
pipeline_tag: text-generation
tags:
- csharp
- mpt
- instruct
- 1b
- llm
- .net
---

Upsides:
- C# code generation and explanation quality is close to (slightly below) that of the 7b [Nethermind/Mpt-Instruct-DotNet-S](https://huggingface.co/Nethermind/Mpt-Instruct-DotNet-S) model,
- 1b parameters (2.6gb, finetuned in bfloat16),
- 6x smaller,
- 4x+ faster
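
The 2.6gb figure follows from simple arithmetic on the parameter count. Here is a minimal sketch, assuming roughly 1.3 billion parameters (an assumption inferred from 2.6gb at 2 bytes per bfloat16 parameter, not a number stated in this card). `estimate_weight_bytes` is a hypothetical helper, not part of any library:

```python
# Bytes used by one parameter for a few common dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def estimate_weight_bytes(num_params: int, dtype: str = "bfloat16") -> int:
    """Return the approximate size of the raw weights in bytes."""
    return num_params * BYTES_PER_PARAM[dtype]


# Assumed parameter count (~1.3 billion); not stated in this model card.
ASSUMED_PARAMS = 1_300_000_000

size_gb = estimate_weight_bytes(ASSUMED_PARAMS) / 1e9
print(f"bfloat16 weights: ~{size_gb:.1f} GB")
```

By the same arithmetic, loading the weights in float32 would roughly double the footprint.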

Downsides:
- Sometimes suffers from repetitive responses that never terminate when answering general discussion questions
- Slightly worse at code generation than the 7b model
- No GGML/llama.cpp support for running on CPU yet
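
The repetition issue can sometimes be reduced through decoding settings. Here is a sketch, assuming the standard `generate` keyword arguments `repetition_penalty` and `no_repeat_ngram_size` from the transformers library. The values are untested guesses for this model, the 1024-token context limit is taken from the usage example, and `build_generation_kwargs` is a hypothetical helper:

```python
def build_generation_kwargs(prompt_len, max_context=1024, max_new=512):
    """Build keyword arguments for model.generate() that discourage loops."""
    # Never ask for more tokens than fit in the assumed context window.
    budget = max(1, min(max_new, max_context - prompt_len))
    return {
        "max_new_tokens": budget,
        "do_sample": False,          # greedy decoding, as in the usage example
        "repetition_penalty": 1.1,   # values > 1.0 penalise already generated tokens
        "no_repeat_ngram_size": 4,   # forbid repeating any 4-token sequence
    }


kwargs = build_generation_kwargs(prompt_len=40)
print(kwargs)
# Intended use (requires the loaded model and tokenized prompt):
# outputs = model.generate(input_ids, **kwargs)
```

Source code legitimately repeats short token sequences, so `no_repeat_ngram_size` may hurt code generation. It is probably best reserved for general discussion prompts.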

Based on [mosaicml/mpt-1b-redpajama-200b-dolly](https://huggingface.co/mosaicml/mpt-1b-redpajama-200b-dolly)

Uses the same data sources as [Nethermind/Mpt-Instruct-DotNet-S](https://huggingface.co/Nethermind/Mpt-Instruct-DotNet-S)

Usage example:
```python
import torch
import transformers
from transformers import AutoTokenizer

out_name = "Nethermind/Mpt-Instruct-DotNet-XS"
model = transformers.AutoModelForCausalLM.from_pretrained(
    out_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model.to('cuda:0')
model.eval()

# Tokenizer is loaded from EleutherAI/gpt-neox-20b; reuse EOS as the pad token
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
tokenizer.pad_token = tokenizer.eos_token

INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"
# Fill in the fixed keys now; {system} and {instruction} are filled per request
PROMPT_FOR_GENERATION_FORMAT = """{system}
{instruction_key}
{instruction}
{response_key}
""".format(
    system="{system}",
    instruction_key=INSTRUCTION_KEY,
    instruction="{instruction}",
    response_key=RESPONSE_KEY,
)


def output_loop(input_tokens, max_context=1024, max_new_tokens=512):
    prompt_len = input_tokens.shape[1]
    # Cap generation so that prompt + answer stay within max_context tokens
    new_tokens = min(max_new_tokens, max_context - prompt_len)
    print(prompt_len, new_tokens)
    # Greedy decoding: top_k/top_p have no effect when do_sample=False, so they are omitted
    return model.generate(input_tokens.to('cuda:0'), max_new_tokens=new_tokens, do_sample=False)


def give_answer(
    instruction="Create a loop over [0, 6, 7, 77] that prints its contents",
    system="Below is an instruction that describes a task. Write a response that appropriately completes the request.",
):
    question = PROMPT_FOR_GENERATION_FORMAT.format(system=system, instruction=instruction)
    tokenized_question = tokenizer.encode(question, return_tensors='pt')
    outputs = output_loop(tokenized_question)
    # The decoded text contains the prompt followed by the generated answer
    answer = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    print(answer)
    return answer


give_answer("What is the main difference between a struct and a class in C#?")
```

Output:
```
A struct is a value type, which means it can only hold a few values. It is often used as a placeholder for other data types. A class, on the other hand, is a reference type, which means it can hold references to other data types.
```
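
The usage example prints the full decoded sequence, which includes the prompt as well as the answer. Here is a minimal sketch of trimming it to the answer only, assuming the decoded text still contains the `### Response:` marker from the prompt template. `extract_response` is a hypothetical helper, not part of transformers:

```python
RESPONSE_KEY = "### Response:"


def extract_response(decoded: str, response_key: str = RESPONSE_KEY) -> str:
    """Return the text after the first response marker, stripped of whitespace."""
    _, marker, tail = decoded.partition(response_key)
    # If the marker is missing, fall back to the whole decoded text.
    return tail.strip() if marker else decoded.strip()


# Hand-written stand-in for one decoded model output.
decoded = (
    "Below is an instruction that describes a task.\n"
    "### Instruction:\n"
    "What is a struct?\n"
    "### Response:\n"
    "A struct is a value type.\n"
)
print(extract_response(decoded))
```

With the usage example, this would be applied to each string in the list returned by `give_answer`.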

Generation time on an RTX 4090 by number of new tokens:
- 2 sec for 128 tokens
- 5 sec for 256 tokens
- 11 sec for 512 tokens
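
These timings can be converted into throughput. Here is a short sketch that uses only the three measurements listed above; `tokens_per_second` is a hypothetical helper introduced for illustration:

```python
# (new tokens, seconds) pairs taken from the RTX 4090 timings above.
TIMINGS = [(128, 2), (256, 5), (512, 11)]


def tokens_per_second(tokens: int, seconds: float) -> float:
    """Average generation throughput for one measurement."""
    return tokens / seconds


for tokens, seconds in TIMINGS:
    rate = tokens_per_second(tokens, seconds)
    print(f"{tokens} tokens in {seconds}s -> {rate:.1f} tokens/s")
```

This works out to 64.0, 51.2 and about 46.5 tokens per second, so throughput falls as the output gets longer.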

Code generation prompt:

> Generate code to answer the question.
>
> How would you retrieve and analyse the fee history for the last 100 blocks and determine the average gas price?

Example of code output:
```csharp
public async Task<decimal> GetFeeHistoryGasPriceAverage()
{
    // Get the fee history
    ResultWrapper<FeeHistoryResults> result = await _ethRpc.eth_feeHistory(100, BlockParameter.Latest,
        new double[] { 50, 75, 90 });
    // Check if the API call succeeded
    if (result.Result != Result.Success)
    {
        throw new Exception("Failed to retrieve fee history");
    }
    // Get the gas price average
    decimal averageGasPrice = result.Data.BaseFeePerGas.Average();

    return averageGasPrice;
}
```