AmberChat - DeepSparse

This repo contains model files for AmberChat optimized for DeepSparse, a CPU inference runtime for sparse models.

This model was quantized and pruned with SparseGPT, using SparseML.
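
To make the two terms concrete, the sketch below prunes a tiny weight list to 50% sparsity and then quantizes the result to int8. It is an illustration only. SparseGPT selects weights using second-order (Hessian-based) information, whereas this stand-in uses plain magnitude pruning. The 50% target is an assumption read off the "pruned50" in the model name, and both function names are hypothetical.

```python
def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude weights (a stand-in, not SparseGPT)."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization, so that w is roughly scale * q."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.1, -0.9]
pruned = magnitude_prune(weights, sparsity=0.5)
q, scale = quantize_int8(pruned)
dequantized = [scale * v for v in q]        # what the runtime effectively sees
sparsity = pruned.count(0.0) / len(pruned)  # fraction of weights that are zero
print(pruned, q, sparsity)
```

Pruned weights stay exactly zero after quantization, which is what lets a sparsity-aware runtime skip them.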

Inference

Install DeepSparse LLM for fast inference on CPUs:

pip install "deepsparse-nightly[llm]"

Run in a Python pipeline:

from deepsparse import TextGeneration

template = "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.\n### Human: Got any creative ideas for a 10 year old’s birthday?\n### Assistant: Of course! Here are some creative ideas for a 10-year-old's birthday party:\n1. Treasure Hunt: Organize a treasure hunt in your backyard or nearby park. Create clues and riddles for the kids to solve, leading them to hidden treasures and surprises.\n2. Science Party: Plan a science-themed party where kids can engage in fun and interactive experiments. You can set up different stations with activities like making slime, erupting volcanoes, or creating simple chemical reactions.\n3. Outdoor Movie Night: Set up a backyard movie night with a projector and a large screen or white sheet. Create a cozy seating area with blankets and pillows, and serve popcorn and snacks while the kids enjoy a favorite movie under the stars.\n4. DIY Crafts Party: Arrange a craft party where kids can unleash their creativity. Provide a variety of craft supplies like beads, paints, and fabrics, and let them create their own unique masterpieces to take home as party favors.\n5. Sports Olympics: Host a mini Olympics event with various sports and games. Set up different stations for activities like sack races, relay races, basketball shooting, and obstacle courses. Give out medals or certificates to the participants.\n6. Cooking Party: Have a cooking-themed party where the kids can prepare their own mini pizzas, cupcakes, or cookies. Provide toppings, frosting, and decorating supplies, and let them get hands-on in the kitchen.\n7. Superhero Training Camp: Create a superhero-themed party where the kids can engage in fun training activities. Set up an obstacle course, have them design their own superhero capes or masks, and organize superhero-themed games and challenges.\n8. Outdoor Adventure: Plan an outdoor adventure party at a local park or nature reserve. Arrange activities like hiking, nature scavenger hunts, or a picnic with games. Encourage exploration and appreciation for the outdoors.\nRemember to tailor the activities to the birthday child's interests and preferences. Have a great celebration!\n### Human: {prompt}\n### Assistant:"

prompt = "How to get into a good university?"

input_str = template.format(prompt=prompt)

model = TextGeneration(model_path="hf:nm-testing/AmberChat-pruned50-quant-ds")

print(model(input_str, max_new_tokens=200).generations[0].text)
"""
Here are some steps to get into a good university:
1. Research universities: Research universities that align with your interests and goals. Look into their programs, faculty, and reputation.
2. Apply early: Apply for universities early in the year. Many universities have early deadlines for admission.
3. Show evidence of academic excellence: Submit transcripts from your high school or college, and any other relevant academic documents.
4. Show evidence of extracurricular activities: Show evidence of extracurricular activities, such as volunteering, extracurricular activities, or internships.
5. Show evidence of leadership and extracurricular activities: Show evidence of leadership and extracurricular activities, such as extracurricular activities, volunteering, or leadership roles.

Show evidence of academic excellence: Submit transcripts from your high school or college, and any other relevant academic documents.
"""

Example 2

from deepsparse import TextGeneration
generation_config = {
    "repetition_penalty": 2.0,
    "do_sample": True,
    "max_new_tokens": 500,
}
template = "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.\n### Human: Got any creative ideas for a 10 year old’s birthday?\n### Assistant: Of course! Here are some creative ideas for a 10-year-old's birthday party:\n1. Treasure Hunt: Organize a treasure hunt in your backyard or nearby park. Create clues and riddles for the kids to solve, leading them to hidden treasures and surprises.\n2. Science Party: Plan a science-themed party where kids can engage in fun and interactive experiments. You can set up different stations with activities like making slime, erupting volcanoes, or creating simple chemical reactions.\n3. Outdoor Movie Night: Set up a backyard movie night with a projector and a large screen or white sheet. Create a cozy seating area with blankets and pillows, and serve popcorn and snacks while the kids enjoy a favorite movie under the stars.\n4. DIY Crafts Party: Arrange a craft party where kids can unleash their creativity. Provide a variety of craft supplies like beads, paints, and fabrics, and let them create their own unique masterpieces to take home as party favors.\n5. Sports Olympics: Host a mini Olympics event with various sports and games. Set up different stations for activities like sack races, relay races, basketball shooting, and obstacle courses. Give out medals or certificates to the participants.\n6. Cooking Party: Have a cooking-themed party where the kids can prepare their own mini pizzas, cupcakes, or cookies. Provide toppings, frosting, and decorating supplies, and let them get hands-on in the kitchen.\n7. Superhero Training Camp: Create a superhero-themed party where the kids can engage in fun training activities. Set up an obstacle course, have them design their own superhero capes or masks, and organize superhero-themed games and challenges.\n8. Outdoor Adventure: Plan an outdoor adventure party at a local park or nature reserve. Arrange activities like hiking, nature scavenger hunts, or a picnic with games. Encourage exploration and appreciation for the outdoors.\nRemember to tailor the activities to the birthday child's interests and preferences. Have a great celebration!\n### Human: {prompt}\n### Assistant:"

prompt = "How to make banana bread?"

input_str = template.format(prompt=prompt)

model = TextGeneration(model_path="deployment")

print(model(input_str, generation_config=generation_config).generations[0].text)
"""
Banana bread is easy enough if you want recipe information on how long you should bake it
"""
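
The `repetition_penalty` setting discourages tokens that have already been generated. How DeepSparse applies it internally is not shown in this card. The sketch below follows the common formulation (the one used by Hugging Face's `RepetitionPenaltyLogitsProcessor`), where a positive logit of a seen token is divided by the penalty and a negative one is multiplied by it. `apply_repetition_penalty` is a hypothetical helper for illustration, and the logits are made-up numbers.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=2.0):
    """Return a copy of logits with already-generated tokens made less likely."""
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # Dividing a positive score or multiplying a negative one both lower it.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


logits = [3.0, 1.5, -0.5, 2.0]   # toy scores for a 4-token vocabulary
generated_ids = [0, 2, 0]        # tokens 0 and 2 were already produced
adjusted = apply_repetition_penalty(logits, generated_ids, penalty=2.0)

best_before = max(range(len(logits)), key=logits.__getitem__)
best_after = max(range(len(adjusted)), key=adjusted.__getitem__)
print(adjusted, best_before, best_after)
```

In this toy case the previously top-ranked token 0 falls behind token 3 once the penalty is applied.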

Prompt template


  ### Human: {prompt}
  ### Assistant:
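
The inference examples prepend a system line and one worked Human/Assistant exchange before the final question. A small helper can assemble prompts in that shape. `build_prompt` and its parameters are hypothetical names, and the exact whitespace the model expects is an assumption based on the template strings used in those examples.

```python
SYSTEM = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions."
)


def build_prompt(prompt, history=(), system=SYSTEM):
    """Format a question, plus optional (human, assistant) turns, as one string."""
    lines = [system]
    for human, assistant in history:
        lines.append(f"### Human: {human}")
        lines.append(f"### Assistant: {assistant}")
    lines.append(f"### Human: {prompt}")
    # Leave the last assistant turn open so the model completes it.
    lines.append("### Assistant:")
    return "\n".join(lines)


input_str = build_prompt(
    "How to make banana bread?",
    history=[("Hi!", "Hello! How can I help you today?")],
)
print(input_str)
```

The resulting `input_str` can be passed to the `TextGeneration` pipeline in place of `template.format(prompt=prompt)`.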

Sparsification

For details on how this model was sparsified, see the recipe.yaml in this repo and follow the instructions below.

git clone https://github.com/neuralmagic/sparseml
pip install -e "sparseml[transformers]"
python sparseml/src/sparseml/transformers/sparsification/obcq/obcq.py LLM360/AmberChat open_platypus --recipe recipe.yaml --save True
python sparseml/src/sparseml/transformers/sparsification/obcq/export.py --task text-generation --model_path obcq_deployment
cp deployment/model.onnx deployment/model-orig.onnx

Then inject the KV cache into the exported ONNX model. This speeds up inference by caching the attention key and value states instead of recomputing them for every new token:

import os
import onnx
from sparseml.exporters.kv_cache_injector import KeyValueCacheInjector
input_file = "deployment/model-orig.onnx"
output_file = "deployment/model.onnx"
model = onnx.load(input_file, load_external_data=False)
model = KeyValueCacheInjector(model_path=os.path.dirname(input_file)).apply(model)
onnx.save(model, output_file)
print(f"Modified model saved to: {output_file}")
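
To see why caching key and value states helps, consider a toy single-head attention decoder. Without a cache, every step recomputes keys and values for the whole prefix. With a cache, only the newest token is projected and appended. The sketch below is pure Python with made-up scalar "projections" and is not how SparseML or DeepSparse implement the cache. It only checks that both paths produce the same outputs.

```python
import math


def project(x, scale):
    """Stand-in for a learned linear projection (hypothetical)."""
    return [scale * v for v in x]


def attend(query, keys, values):
    """Softmax attention of one query over all available positions."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]


def decode_without_cache(tokens):
    """Recompute keys and values for the full prefix at every step."""
    outputs = []
    for t in range(len(tokens)):
        prefix = tokens[: t + 1]
        keys = [project(x, 0.5) for x in prefix]
        values = [project(x, 2.0) for x in prefix]
        outputs.append(attend(project(tokens[t], 1.0), keys, values))
    return outputs


def decode_with_cache(tokens):
    """Project only the newest token and append it to the cache."""
    keys, values, outputs = [], [], []
    for x in tokens:
        keys.append(project(x, 0.5))
        values.append(project(x, 2.0))
        outputs.append(attend(project(x, 1.0), keys, values))
    return outputs


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
no_cache = decode_without_cache(tokens)
cached = decode_with_cache(tokens)
print(cached)
```

For n tokens, the uncached path performs a number of key/value projections that grows quadratically with n, while the cached path performs a number that grows linearly.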

For a step-by-step guide to one-shot quantization of large language models, see our One Shot With SparseML page.

Slack

For further support and for discussion of these models and AI in general, join Neural Magic's Slack Community.

Model tree for nm-testing/AmberChat-pruned50-quant-ds

Base model: LLM360/AmberChat
This model: quantized