dbouius committed
Commit 738e763 · 1 Parent(s): b442e82

Updated with Ryzen AI, TGI and Benchmarking sections

Files changed (1):
1. README.md +22 -2
README.md CHANGED
@@ -16,8 +16,16 @@ enabling high performance and high efficiency to make the world smarter.
 
 # Getting Started with Hugging Face Transformers
 
-This section describes how to use the most common transformers on Hugging Face
-for inference workloads on AMD accelerators using the AMD ROCm software ecosystem.
 This base knowledge can be leveraged to start fine-tuning from a base model or even start developing your own model.
 General Linux and ML experience is a prerequisite.
 
@@ -85,6 +93,18 @@ Here are a few of the more popular ones to get you started:
 
 Click on the 'Use in Transformers' button to see the exact code to import a specific model into your Python application.
 
 # Useful Links and Blogs
 
 - Check out our blog titled [Run a ChatGPT-like Chatbot on a Single GPU with ROCm](https://huggingface.co/blog/chatbot-amd-gpu)
 
 
 # Getting Started with Hugging Face Transformers
 
+AMD’s Ryzen™ AI family of laptop processors provides an integrated Neural Processing Unit (NPU)
+that offloads AI processing tasks from the host CPU and GPU. Ryzen™ AI software consists of the Vitis™ AI
+execution provider (EP) for ONNX Runtime, combined with quantization tools and a pre-optimized model zoo.
+All of this is made possible by Ryzen™ AI technology, built on the AMD XDNA™ architecture,
+purpose-built to run AI workloads efficiently and locally,
+offering a host of benefits to developers building the next groundbreaking AI app. Details on getting started
+with Hugging Face models are available on the [Optimum page](https://moon-ci-docs.huggingface.co/docs/optimum-amd/pr_29/en/ryzenai/overview).
+
+The following section describes how to use the most common transformers on Hugging Face
+for inference workloads on select AMD Instinct™ accelerators and AMD Radeon™ GPUs using the AMD ROCm software ecosystem.
 This base knowledge can be leveraged to start fine-tuning from a base model or even start developing your own model.
 General Linux and ML experience is a prerequisite.
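As an illustration of the Ryzen™ AI flow described above, here is a minimal sketch of running an ONNX model through ONNX Runtime while preferring the Vitis™ AI execution provider when it is available. The provider name `VitisAIExecutionProvider`, the fallback logic, and the model path are assumptions based on the standard ONNX Runtime provider API, not code from this repository.

```python
# Minimal sketch: run an ONNX model on the Ryzen AI NPU via the Vitis AI
# execution provider, falling back to CPU on other machines. The provider
# name and model path are illustrative assumptions.
def pick_providers(available: list) -> list:
    """Prefer the Vitis AI EP (Ryzen AI NPU) when present, else CPU."""
    if "VitisAIExecutionProvider" in available:
        return ["VitisAIExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]

def run_model(model_path: str, inputs: dict):
    import onnxruntime as ort  # Ryzen AI installs ship their own onnxruntime build
    session = ort.InferenceSession(
        model_path, providers=pick_providers(ort.get_available_providers())
    )
    return session.run(None, inputs)  # None -> return all model outputs
```

On a non-Ryzen machine the same script still runs on the CPU provider, which makes it convenient for testing the surrounding pipeline.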
 
 
 
 Click on the 'Use in Transformers' button to see the exact code to import a specific model into your Python application.
 
+# Serving a model with TGI
+
+Text Generation Inference (a.k.a. “TGI”) provides an end-to-end solution for deploying large language models for inference at scale.
+TGI is already usable in production on AMD Instinct™ GPUs through the Docker image `ghcr.io/huggingface/text-generation-inference:1.2-rocm`.
+Make sure to refer to the [documentation](https://huggingface.co/docs/text-generation-inference/supported_models#supported-hardware)
+for details on hardware support and any limitations.
+
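To make the TGI section concrete, here is a small client sketch against TGI's REST `/generate` endpoint. It assumes a server is already running locally; the launch command in the comment, the port, and the model id are illustrative, so check the TGI documentation for the exact flags on your hardware.

```python
# Sketch: query a locally running TGI server. Assumes it was launched with
# something like (flags and model id are illustrative; see the TGI docs):
#   docker run --device /dev/kfd --device /dev/dri -p 8080:80 \
#     ghcr.io/huggingface/text-generation-inference:1.2-rocm --model-id <model>
import json
from urllib import request

def build_payload(prompt: str, max_new_tokens: int = 64) -> dict:
    # TGI's /generate endpoint takes "inputs" plus generation "parameters".
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}

def generate(prompt: str, url: str = "http://localhost:8080/generate") -> str:
    req = request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]
```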
+# Benchmarking
+
+[Optimum-Benchmark](https://github.com/huggingface/optimum-benchmark) is available as a utility to easily benchmark the performance of transformers on AMD GPUs,
+across normal and distributed settings, with various supported optimizations and quantization schemes.
+
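As a rough illustration of the kind of measurement Optimum-Benchmark automates, here is a hand-rolled latency benchmark sketch (deliberately not the Optimum-Benchmark API itself): warmup runs, timed runs, and summary statistics.

```python
# Hand-rolled latency benchmark sketch: warmup runs, timed runs, and summary
# statistics. Optimum-Benchmark automates and extends this kind of measurement.
import statistics
import time

def benchmark(fn, *, warmup: int = 3, runs: int = 10) -> dict:
    for _ in range(warmup):  # warm up caches, JIT compilation, GPU kernels
        fn()
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.mean(latencies),
        "p50_s": statistics.median(latencies),
        "max_s": max(latencies),
        "runs": runs,
    }
```

Wrapping a model's forward pass in `fn` gives a quick latency estimate; the dedicated tool adds distributed settings, memory tracking, and quantization-aware configurations on top.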
 # Useful Links and Blogs
 
 - Check out our blog titled [Run a ChatGPT-like Chatbot on a Single GPU with ROCm](https://huggingface.co/blog/chatbot-amd-gpu)