---
license: cc-by-sa-3.0
language:
- en
pipeline_tag: text-generation
tags:
- csharp
- mpt
- instruct
- 1b
- llm
- .net
---
Upsides:
- similar in quality (slightly worse) to the 7b [Nethermind/Mpt-Instruct-DotNet-S](https://huggingface.co/Nethermind/Mpt-Instruct-DotNet-S) for C# code generation and explanation,
- 1b parameter size (2.6 GB, bfloat16-finetuned),
- 6x smaller,
- 4x+ faster

Downsides:
- Sometimes suffers from repetitive, non-terminating responses to general discussion questions
- Slightly worse at code generation than the 7b model
- No GGML/llama.cpp CPU inference support yet
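The repetition issue can often be tamed in post-processing. Below is a minimal, hypothetical helper (not part of this model's code) that trims a response once it ends in an immediately repeated chunk of text:

```python
def truncate_repetition(text: str, min_chunk: int = 8) -> str:
    """Trim a response that ends in an immediately repeated chunk,
    keeping a single copy of the repeated part."""
    for size in range(len(text) // 2, min_chunk - 1, -1):
        # Repeatedly drop the tail while it duplicates the chunk just before it
        while len(text) >= 2 * size and text[-size:] == text[-2 * size:-size]:
            text = text[:-size]
    return text

# A looping ending is collapsed to a single copy of the repeated sentence
looped = "The code compiles. " * 5
print(truncate_repetition(looped))
```

Tuning `min_chunk` trades off false positives (legitimate short repeats) against missed loops; alternatively, passing `repetition_penalty` to `model.generate` attacks the problem at decoding time.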
Based on [mosaicml/mpt-1b-redpajama-200b-dolly](https://huggingface.co/mosaicml/mpt-1b-redpajama-200b-dolly)

Same data sources as in [Nethermind/Mpt-Instruct-DotNet-S](https://huggingface.co/Nethermind/Mpt-Instruct-DotNet-S)

Usage example:
```python
import torch
import transformers
from transformers import AutoTokenizer

out_name = "Nethermind/Mpt-Instruct-DotNet-XS"
model = transformers.AutoModelForCausalLM.from_pretrained(
    out_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model.to('cuda:0')
model.eval()

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
tokenizer.pad_token = tokenizer.eos_token

INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"
PROMPT_FOR_GENERATION_FORMAT = """{system}
{instruction_key}
{instruction}
{response_key}
""".format(
    system="{system}",
    instruction_key=INSTRUCTION_KEY,
    instruction="{instruction}",
    response_key=RESPONSE_KEY,
)


def output_loop(input_tokens):
    # Greedy decoding, capped so the prompt plus new tokens
    # stay within the model's context window
    return model.generate(
        input_tokens.to('cuda:0'),
        max_new_tokens=min(512, 1024 - input_tokens.shape[1]),
        do_sample=False,
    )


def give_answer(instruction="Create a loop over [0, 6, 7, 77] that prints its contents",
                system="Below is an instruction that describes a task. Write a response that appropriately completes the request."):
    question = PROMPT_FOR_GENERATION_FORMAT.format(system=system, instruction=instruction)
    tokenized_question = tokenizer.encode(question, return_tensors='pt')
    outputs = output_loop(tokenized_question)
    answer = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    print(answer)
    return answer


give_answer("What is the main difference between a struct and a class in C#?")
```

outputs:
```
A struct is a value type, which means it can only hold a few values. It is often used as a placeholder for other data types. A class, on the other hand, is a reference type, which means it can hold references to other data types.
```

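The Dolly-style prompt template used in the snippet can be sanity-checked without loading the model or a GPU; this standalone fragment just reproduces the two-stage string formatting:

```python
INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"

# First pass bakes in the section markers while leaving {system} and
# {instruction} as placeholders for the second pass.
PROMPT_FOR_GENERATION_FORMAT = """{system}
{instruction_key}
{instruction}
{response_key}
""".format(
    system="{system}",
    instruction_key=INSTRUCTION_KEY,
    instruction="{instruction}",
    response_key=RESPONSE_KEY,
)

prompt = PROMPT_FOR_GENERATION_FORMAT.format(
    system="Below is an instruction that describes a task.",
    instruction="What is the main difference between a struct and a class in C#?",
)
print(prompt)
```

The resulting prompt ends with the bare `### Response:` marker, which is what cues the model to start answering.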
On an RTX 4090, generation time by number of new tokens:
- 2 sec for 128 tokens
- 5 sec for 256 tokens
- 11 sec for 512 tokens

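Those timings work out to roughly 45-65 new tokens per second, with throughput dropping as the sequence grows; a quick back-of-the-envelope conversion:

```python
# (new tokens, seconds) pairs from the RTX 4090 timings above
timings = [(128, 2), (256, 5), (512, 11)]

for tokens, seconds in timings:
    # Per-token cost rises with sequence length, so tokens/sec falls
    print(f"{tokens} tokens: {tokens / seconds:.0f} tokens/sec")
```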
Code generation:
prompt:
> Generate code to answer the question.
>
> How would you retrieve and analyse the fee history for the last 100 blocks and determine the average gas price?

Example of code output:
```csharp
public async Task<decimal> GetFeeHistoryGasPriceAverage()
{
    // Get the fee history
    ResultWrapper<FeeHistoryResults> result = await _ethRpc.eth_feeHistory(100, BlockParameter.Latest,
        new double[] { 50, 75, 90 });
    // Check if the API call succeeded
    if (result.Result != Result.Success)
    {
        throw new Exception("Failed to retrieve fee history");
    }
    // Get the gas price average
    decimal averageGasPrice = result.Data.BaseFeePerGas.Average();

    return averageGasPrice;
}
```