---
license: cc-by-sa-3.0
language:
- en
pipeline_tag: text-generation
tags:
- csharp
- mpt
- instruct
- 1b
- llm
- .net
---
Upsides:
- Close in quality (slightly worse) to the 7B [Nethermind/Mpt-Instruct-DotNet-S](https://huggingface.co/Nethermind/Mpt-Instruct-DotNet-S) for C# code generation and explanation
- 1B parameters (2.6 GB, bfloat16 finetuned)
- 6x smaller
- 4x+ faster
Downsides:
- Sometimes suffers from repetitive, non-terminating responses to general discussion questions (a mitigation sketch follows the example output below)
- Slightly worse at code generation than the 7B model
- No GGML/llama.cpp CPU inference support yet
Based on [mosaicml/mpt-1b-redpajama-200b-dolly](https://huggingface.co/mosaicml/mpt-1b-redpajama-200b-dolly).
Trained on the same data sources as [Nethermind/Mpt-Instruct-DotNet-S](https://huggingface.co/Nethermind/Mpt-Instruct-DotNet-S).
Usage example:
```python
import torch
import transformers
from transformers import AutoTokenizer

out_name = "Nethermind/Mpt-Instruct-DotNet-XS"
model = transformers.AutoModelForCausalLM.from_pretrained(
    out_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model.to('cuda:0')
model.eval()

# The model reuses the GPT-NeoX tokenizer
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
tokenizer.pad_token = tokenizer.eos_token

INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"
PROMPT_FOR_GENERATION_FORMAT = """{system}
{instruction_key}
{instruction}
{response_key}
""".format(
    system="{system}",
    instruction_key=INSTRUCTION_KEY,
    instruction="{instruction}",
    response_key=RESPONSE_KEY,
)

def output_loop(input_tokens):
    # Greedy decoding; cap new tokens so prompt + completion stays
    # within the 1024-token context window
    print(input_tokens.shape[1], 1024 - input_tokens.shape[1])
    return model.generate(
        input_tokens.to('cuda:0'),
        max_new_tokens=min(512, 1024 - input_tokens.shape[1]),
        do_sample=False,
    )

def give_answer(instruction="Create a loop over [0, 6, 7, 77] that prints its contents",
                system="Below is an instruction that describes a task. Write a response that appropriately completes the request."):
    question = PROMPT_FOR_GENERATION_FORMAT.format(system=system, instruction=instruction)
    tokenized_question = tokenizer.encode(question, return_tensors='pt')
    outputs = output_loop(tokenized_question)
    answer = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    print(answer)
    return answer

give_answer("What is the main difference between a struct and a class in C#?")
```
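The decoded output contains the full prompt followed by the completion. If you want only the answer text, a minimal sketch that splits on the `### Response:` marker (the `extract_response` helper is hypothetical, not part of the model's API):
```python
def extract_response(decoded: str) -> str:
    # Hypothetical helper: keep only the text after the first
    # RESPONSE_KEY marker, i.e. the model's answer
    _, _, response = decoded.partition(RESPONSE_KEY)
    return response.strip()

raw = give_answer("What is the main difference between a struct and a class in C#?")
print(extract_response(raw[0]))  # batch_decode returns a list of strings
```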
outputs:
```
A struct is a value type, which means it can only hold a few values. It is often used as a placeholder for other data types. A class, on the other hand, is a reference type, which means it can hold references to other data types.
```
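As noted under Downsides, open-ended discussion prompts can occasionally produce repetitive, non-terminating output. One common mitigation is to penalize repeated n-grams at generation time; a sketch below, with illustrative (untuned) parameter values:
```python
question = PROMPT_FOR_GENERATION_FORMAT.format(
    system="Below is an instruction that describes a task. Write a response that appropriately completes the request.",
    instruction="Explain the difference between IEnumerable and IQueryable in C#.",
)
input_tokens = tokenizer.encode(question, return_tensors='pt').to('cuda:0')
# Standard Hugging Face anti-repetition options; values are illustrative,
# not tuned for this model
outputs = model.generate(
    input_tokens,
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.1,    # >1.0 discourages reusing recent tokens
    no_repeat_ngram_size=3,    # forbid exact 3-gram repeats
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```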
Generation speed on an RTX 4090 (time per number of new tokens):
- 2 s for 128 tokens
- 5 s for 256 tokens
- 11 s for 512 tokens
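A minimal sketch for reproducing these timings on your own hardware, assuming the `model` and `tokenizer` objects from the usage example above (the prompt is illustrative):
```python
import time

PROMPT = "### Instruction:\nExplain LINQ in C#.\n### Response:\n"

def time_generation(new_tokens: int) -> float:
    # Wall-clock time for generating exactly `new_tokens` tokens
    tokens = tokenizer.encode(PROMPT, return_tensors='pt').to('cuda:0')
    torch.cuda.synchronize()
    start = time.perf_counter()
    model.generate(tokens, max_new_tokens=new_tokens,
                   min_new_tokens=new_tokens, do_sample=False)
    torch.cuda.synchronize()
    return time.perf_counter() - start

for n in (128, 256, 512):
    print(f"{n} tokens: {time_generation(n):.1f} s")
```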
Code generation:
prompt:
> Generate code to answer the question.
>
> How would you retrieve and analyse the fee history for the last 100 blocks and determine the average gas price?
Example of code output:
```csharp
public async Task<decimal> GetFeeHistoryGasPriceAverage()
{
    // Get the fee history
    ResultWrapper<FeeHistoryResults> result = await _ethRpc.eth_feeHistory(100, BlockParameter.Latest,
        new double[] { 50, 75, 90 });
    // Check if the API call succeeded
    if (result.Result != Result.Success)
    {
        throw new Exception("Failed to retrieve fee history");
    }
    // Get the gas price average
    decimal averageGasPrice = result.Data.BaseFeePerGas.Average();
    return averageGasPrice;
}
``` |