dbouius committed
Commit 738e763 · 1 Parent(s): b442e82

Updated with Ryzen AI, TGI and Benchmarking sections

Files changed (1):
1. README.md +22 -2
README.md CHANGED
@@ -16,8 +16,16 @@ enabling high performance and high efficiency to make the world smarter.
 
 # Getting Started with Hugging Face Transformers
 
-This section describes how to use the most common transformers on Hugging Face
-for inference workloads on AMD accelerators using the AMD ROCm software ecosystem.
 This base knowledge can be leveraged to start fine-tuning from a base model or even start developing your own model.
 General Linux and ML experience is a prerequisite.
 
@@ -85,6 +93,18 @@ Here are a few of the more popular ones to get you started:
 
 Click on the 'Use in Transformers' button to see the exact code to import a specific model into your Python application.
 
 # Useful Links and Blogs
 
 - Check out our blog titled [Run a ChatGPT-like Chatbot on a Single GPU with ROCm](https://huggingface.co/blog/chatbot-amd-gpu)
 
 
 # Getting Started with Hugging Face Transformers
 
+AMD’s Ryzen™ AI family of laptop processors provides an integrated Neural Processing Unit (NPU)
+that offloads AI processing tasks from the host CPU and GPU. Ryzen™ AI software consists of the Vitis™ AI
+execution provider (EP) for ONNX Runtime, combined with quantization tools and a pre-optimized model zoo.
+All of this is made possible by Ryzen™ AI technology, built on the AMD XDNA™ architecture,
+purpose-built to run AI workloads efficiently and locally,
+offering a host of benefits to developers building the next groundbreaking AI app. Details on getting started
+with Hugging Face models are available on the [Optimum page](https://moon-ci-docs.huggingface.co/docs/optimum-amd/pr_29/en/ryzenai/overview).
+
+The following section describes how to use the most common transformers on Hugging Face
+for inference workloads on select AMD Instinct™ accelerators and AMD Radeon™ GPUs using the AMD ROCm software ecosystem.
 This base knowledge can be leveraged to start fine-tuning from a base model or even start developing your own model.
 General Linux and ML experience is a prerequisite.
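As an illustration of the Ryzen™ AI flow described above, here is a minimal sketch of running an ONNX model through ONNX Runtime while preferring the Vitis™ AI execution provider when it is available. The provider name `VitisAIExecutionProvider`, the fallback logic, and the model path are assumptions based on the standard ONNX Runtime provider API, not code from this repository.

```python
# Minimal sketch: run an ONNX model on the Ryzen AI NPU via the Vitis AI
# execution provider, falling back to CPU on other machines. The provider
# name and model path are illustrative assumptions.
def pick_providers(available: list) -> list:
    """Prefer the Vitis AI EP (Ryzen AI NPU) when present, else CPU."""
    if "VitisAIExecutionProvider" in available:
        return ["VitisAIExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]

def run_model(model_path: str, inputs: dict):
    import onnxruntime as ort  # Ryzen AI installs ship their own onnxruntime build
    session = ort.InferenceSession(
        model_path, providers=pick_providers(ort.get_available_providers())
    )
    return session.run(None, inputs)  # None -> return all model outputs
```

On a non-Ryzen machine the same script still runs on the CPU provider, which makes it convenient for testing the surrounding pipeline.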
 
 
 
 Click on the 'Use in Transformers' button to see the exact code to import a specific model into your Python application.
 
+# Serving a model with TGI
+
+Text Generation Inference (a.k.a. “TGI”) provides an end-to-end solution for deploying large language models for inference at scale.
+TGI is already usable in production on AMD Instinct™ GPUs through the Docker image `ghcr.io/huggingface/text-generation-inference:1.2-rocm`.
+Make sure to refer to the [documentation](https://huggingface.co/docs/text-generation-inference/supported_models#supported-hardware)
+for details on hardware support and any limitations.
+
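To make the TGI section concrete, here is a small client sketch against TGI's REST `/generate` endpoint. It assumes a server is already running locally; the launch command in the comment, the port, and the model id are illustrative, so check the TGI documentation for the exact flags on your hardware.

```python
# Sketch: query a locally running TGI server. Assumes it was launched with
# something like (flags and model id are illustrative; see the TGI docs):
#   docker run --device /dev/kfd --device /dev/dri -p 8080:80 \
#     ghcr.io/huggingface/text-generation-inference:1.2-rocm --model-id <model>
import json
from urllib import request

def build_payload(prompt: str, max_new_tokens: int = 64) -> dict:
    # TGI's /generate endpoint takes "inputs" plus generation "parameters".
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}

def generate(prompt: str, url: str = "http://localhost:8080/generate") -> str:
    req = request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]
```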
+# Benchmarking
+
+[Optimum-Benchmark](https://github.com/huggingface/optimum-benchmark) is available as a utility to easily benchmark the performance of transformers on AMD GPUs,
+across normal and distributed settings, with various supported optimizations and quantization schemes.
+
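As a rough illustration of the kind of measurement Optimum-Benchmark automates, here is a hand-rolled latency benchmark sketch (deliberately not the Optimum-Benchmark API itself): warmup runs, timed runs, and summary statistics.

```python
# Hand-rolled latency benchmark sketch: warmup runs, timed runs, and summary
# statistics. Optimum-Benchmark automates and extends this kind of measurement.
import statistics
import time

def benchmark(fn, *, warmup: int = 3, runs: int = 10) -> dict:
    for _ in range(warmup):  # warm up caches, JIT compilation, GPU kernels
        fn()
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.mean(latencies),
        "p50_s": statistics.median(latencies),
        "max_s": max(latencies),
        "runs": runs,
    }
```

Wrapping a model's forward pass in `fn` gives a quick latency estimate; the dedicated tool adds distributed settings, memory tracking, and quantization-aware configurations on top.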
 # Useful Links and Blogs
 
 - Check out our blog titled [Run a ChatGPT-like Chatbot on a Single GPU with ROCm](https://huggingface.co/blog/chatbot-amd-gpu)