---
license: gpl-3.0
---

# Testing model

You can test the model at https://huggingface.co/spaces/nvomai/nvom-phi-3.5-mini-3b, which runs the q4_k_m quantization for more optimized inference.

Phi 3.5 mini by Microsoft, optimized by NVOM.ai.

Benchmarks:

| Model | NVOM.ai AIBench | SmartBench | Specials (based on NVOM.ai bench) |
|---|---|---|---|
| NVOM Preview 4b | 7.91 | 10.12 | 9.49 (Speed, Smart, Quantisation) |
| Phi 3.5 mini | 5.31 | 4.95 | 6.31 (Smart, Speed) |
| Gemma 2 9b (original) | 3.76 | 2.11 | 2.93 (Quantisation, Smart) |

Score scale: 1-4 = bad, 5-8 = normal, 8-12 = best.

### Loading the model locally

After obtaining the model checkpoint, you can use this sample code for inference.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

torch.random.manual_seed(0)

# Load the NVOM-optimized model. It reuses the tokenizer of its base model,
# microsoft/Phi-3.5-mini-instruct.
model = AutoModelForCausalLM.from_pretrained(
    "nvomai/nvom-preview-4b",
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3.5-mini-instruct")

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
    {"role": "user", "content": "What about solving the equation 2x + 3 = 7?"},
]

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

# Greedy decoding: do_sample=False makes generation deterministic.
generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "temperature": 0.0,
    "do_sample": False,
}

output = pipe(messages, **generation_args)
print(output[0]["generated_text"])
```
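
### Running the q4_k_m quantization locally

The hosted Space runs a q4_k_m quantization. If you have a q4_k_m GGUF export of the model, a minimal sketch of running it locally with llama-cpp-python could look like the following; the GGUF file name below is an assumption, not a published artifact.

```python
from llama_cpp import Llama

# Assumed local file name: download or convert a q4_k_m GGUF of the model first.
llm = Llama(
    model_path="nvom-preview-4b.Q4_K_M.gguf",  # hypothetical file name
    n_ctx=4096,       # context window size
    n_gpu_layers=-1,  # offload all layers to GPU if built with GPU support
)

# Same chat-style prompt as the transformers example above.
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "What about solving the equation 2x + 3 = 7?"},
    ],
    max_tokens=200,
    temperature=0.0,  # greedy decoding for reproducible output
)
print(response["choices"][0]["message"]["content"])
```

q4_k_m trades a small amount of accuracy for a much smaller memory footprint, which is why the Space uses it for faster, cheaper inference.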