Updated with Ryzen AI, TGI and Benchmarking sections
README.md

# Getting Started with Hugging Face Transformers

AMD’s Ryzen™ AI family of laptop processors provides users with an integrated Neural Processing Unit (NPU) that offloads AI processing tasks from the host CPU and GPU. Ryzen™ AI software consists of the Vitis™ AI execution provider (EP) for ONNX Runtime, combined with quantization tools and a pre-optimized model zoo. All of this is made possible by Ryzen™ AI technology, built on the AMD XDNA™ architecture and purpose-built to run AI workloads efficiently and locally, offering a host of benefits to developers innovating the next groundbreaking AI app. Details on getting started with Hugging Face models are available on the [Optimum page](https://moon-ci-docs.huggingface.co/docs/optimum-amd/pr_29/en/ryzenai/overview).
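
As a rough sketch of what NPU-offloaded inference looks like in practice, the snippet below runs a quantized ONNX model through ONNX Runtime with the Vitis AI execution provider. The model file, input shape, and the `vaip_config.json` path are placeholders, not files shipped with this repository; see the Ryzen AI documentation for the exact setup on your machine.

```python
# Minimal sketch: ONNX Runtime inference through the Vitis AI execution
# provider (Ryzen AI NPU). Model and config paths are placeholders.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "model_quantized.onnx",                                  # hypothetical pre-quantized model
    providers=["VitisAIExecutionProvider"],                  # NPU-backed execution provider
    provider_options=[{"config_file": "vaip_config.json"}],  # config shipped with Ryzen AI software
)

# Feed a dummy input matching the model's expected shape (assumed here).
input_name = session.get_inputs()[0].name
dummy = np.zeros((1, 3, 224, 224), dtype=np.float32)
outputs = session.run(None, {input_name: dummy})
print(outputs[0].shape)
```
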
The following section describes how to use the most common transformers on Hugging Face for inference workloads on select AMD Instinct™ accelerators and AMD Radeon™ GPUs using the AMD ROCm software ecosystem. This base knowledge can be leveraged to start fine-tuning from a base model, or even to start developing your own model. General Linux and ML experience is a required prerequisite.
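
As a minimal illustration of that workflow: on a ROCm build of PyTorch, AMD GPUs are exposed through the standard `cuda` device API, so the usual transformers code runs unchanged. The model choice below (`gpt2`) is ours, purely for illustration.

```python
# Minimal sketch: standard transformers inference on an AMD GPU.
# ROCm builds of PyTorch expose AMD GPUs through the usual "cuda" device API,
# so torch.cuda.is_available() returns True on a working ROCm install.
import torch
from transformers import pipeline

device = 0 if torch.cuda.is_available() else -1  # GPU 0 if present, else CPU
generator = pipeline("text-generation", model="gpt2", device=device)
print(generator("ROCm lets transformers", max_new_tokens=20)[0]["generated_text"])
```
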
Click on the 'Use in Transformers' button to see the exact code to import a specific model into your Python application.
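
The generated snippet differs per model; for an encoder checkpoint such as `bert-base-uncased` (our example, not tied to any particular button) it is typically along these lines:

```python
# Typical shape of a 'Use in Transformers' snippet; the checkpoint name
# here is illustrative.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
```
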
# Serving a model with TGI

Text Generation Inference (a.k.a. “TGI”) provides an end-to-end solution for deploying large language models for inference at scale. TGI is already usable in production on AMD Instinct™ GPUs through the Docker image `ghcr.io/huggingface/text-generation-inference:1.2-rocm`. Make sure to refer to the [documentation](https://huggingface.co/docs/text-generation-inference/supported_models#supported-hardware) concerning hardware support and any limitations.
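
Once the container is up and serving a model, the endpoint can be queried from Python. A minimal sketch using `huggingface_hub.InferenceClient`, assuming the server was mapped to port 8080 on localhost:

```python
# Minimal sketch: querying a running TGI server from Python.
# The endpoint URL is an assumption; use whatever host/port you mapped.
from huggingface_hub import InferenceClient

client = InferenceClient("http://localhost:8080")
print(client.text_generation("What is ROCm?", max_new_tokens=64))
```
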
# Benchmarking

[Optimum-Benchmark](https://github.com/huggingface/optimum-benchmark) is available as a utility to easily benchmark the performance of transformers on AMD GPUs, in both normal and distributed settings, with various supported optimizations and quantization schemes.
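
As an illustration, a run can be configured from Python roughly as follows. This is a sketch based on the `BenchmarkConfig`/`Benchmark.launch` API shown in the Optimum-Benchmark README; the interface has evolved across versions, and the model and device below are placeholder choices.

```python
# Minimal sketch of a single-GPU inference benchmark with Optimum-Benchmark.
# Class names follow the project's README; verify against the version you install.
from optimum_benchmark import Benchmark, BenchmarkConfig, InferenceConfig, ProcessConfig, PyTorchConfig

if __name__ == "__main__":
    config = BenchmarkConfig(
        name="pytorch_gpt2",
        launcher=ProcessConfig(),                             # isolate the run in a subprocess
        scenario=InferenceConfig(latency=True, memory=True),  # measure latency and memory
        backend=PyTorchConfig(model="gpt2", device="cuda"),   # ROCm GPUs are addressed as "cuda"
    )
    report = Benchmark.launch(config)
    print(report)
```
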
# Useful Links and Blogs
- Check out our blog titled [Run a Chatgpt-like Chatbot on a Single GPU with ROCm](https://huggingface.co/blog/chatbot-amd-gpu)