Update README.md
README.md (changed)
@@ -1,152 +1,89 @@
-<html>
-<head>
-<style>
-table {
-  border-collapse: collapse;
-  font-family: Arial, sans-serif;
-}
-th, td {
-  border: 1px solid #ddd;
-  padding: 8px;
-}
-th {
-  background-color: #f4f4f4;
-  text-align: left;
-}
-tr:nth-child(even) {
-  background-color: #f9f9f9;
-}
-tr:hover {
-  background-color: #f1f1f1;
-}
-h2, p {
-  font-family: Arial, sans-serif;
-}
-</style>
-</head>
-<body>
-
-<h2>Explore Arm-Optimized Learning Paths on Hugging Face</h2>
-<p>
-  Discover curated <strong>Learning Paths</strong> that showcase AI models optimized for Arm platforms across key market applications.
-  Each Learning Path highlights specific models featured in our dedicated Hugging Face <strong>Model Collections</strong>,
-  simplifying your journey from learning to deployment on Arm technologies.
-</p>
-
-<table>
-  <thead>
-    <tr>
-      <th>Learning Path</th>
-      <th>Market Application</th>
-      <th>Model(s) Featured</th>
-      <th>Arm Platform</th>
-    </tr>
-  </thead>
-  <tbody>
-    <tr>
-      <td>Build a RAG application using Zilliz Cloud on Arm servers</td>
-      <td>Cloud, Datacenter</td>
-      <td>all-MiniLM-L6-v2</td>
-      <td>AWS Graviton3 (c7g.2xlarge)</td>
-    </tr>
-    <tr>
-      <td>Accelerate NLP models from Hugging Face on Arm servers</td>
-      <td>Cloud, Datacenter</td>
-      <td>DistilBERT base uncased finetuned SST-2</td>
-      <td>AWS Graviton3</td>
-    </tr>
-    <tr>
-      <td>Deploy a Large Language Model (LLM) chatbot with llama.cpp using KleidiAI on Arm servers</td>
-      <td>Cloud, Datacenter</td>
-      <td>Dolphin 2.9.4 Llama 3.1 8b</td>
-      <td>AWS Graviton4 (r8g.16xlarge)</td>
-    </tr>
-    <tr>
-      <td>Run a Large Language Model (LLM) chatbot with PyTorch using KleidiAI on Arm servers</td>
-      <td>Cloud, Datacenter</td>
-      <td>Llama-3.1-8B-Instruct</td>
-      <td>Arm-based instance with at least 16 CPUs</td>
-    </tr>
-    <tr>
-      <td>Deploy a RAG-based Chatbot with llama-cpp-python using KleidiAI on Google Axion processors</td>
-      <td>Cloud, Datacenter</td>
-      <td>Llama3.1 8b</td>
-      <td>Google Cloud Axion</td>
-    </tr>
-    <tr>
-      <td>Build an Android chat app with Llama, KleidiAI, ExecuTorch, and XNNPACK</td>
-      <td>Edge Device (Smartphone)</td>
-      <td>Llama-3.2-1B-Instruct</td>
-      <td>Smartphone based on Arm CPU (with i8mm support)</td>
-    </tr>
-    <tr>
-      <td>Run a local LLM chatbot on a Raspberry Pi 5</td>
-      <td>Edge Device (Raspberry Pi)</td>
-      <td>Orca-Mini-3B</td>
-      <td>Raspberry Pi 5</td>
-    </tr>
-    <tr>
-      <td>Build an Android chat application with ONNX Runtime API</td>
-      <td>Edge Device (Smartphone)</td>
-      <td>Phi-3-vision-128k-instruct-onnx-cuda</td>
-      <td>Samsung Galaxy S24 Android smartphone</td>
-    </tr>
-    <tr>
-      <td>Run an LLM chatbot with rtp-llm on Arm-based servers</td>
-      <td>Cloud, Datacenter</td>
-      <td>Qwen2 0.5B-Instruct</td>
-      <td>Arm server based on Neoverse N2 or Arm Neoverse V2</td>
-    </tr>
-    <tr>
-      <td>Build and Run a Virtual Large Language Model (vLLM) on Arm Servers</td>
-      <td>Cloud, Datacenter</td>
-      <td>Qwen2.5-0.5B-Instruct</td>
-      <td>Arm-based instance with at least 8 CPUs and 16 GB RAM</td>
-    </tr>
-    <tr>
-      <td>Run a Natural Language Processing (NLP) model from Hugging Face on Arm servers</td>
-      <td>Cloud, Datacenter</td>
-      <td>twitter-roberta-base-sentiment-latest</td>
-      <td>Arm AArch64 CPU based server</td>
-    </tr>
-    <tr>
-      <td>Profile the Performance of AI and ML Mobile Applications on Arm</td>
-      <td>Edge Device (Smartphone)</td>
-      <td>MobileNet V2</td>
-      <td>Arm-based Android smartphone</td>
-    </tr>
-    <tr>
-      <td>Get started with object detection using a Jetson Orin Nano</td>
-      <td>Edge Device</td>
-      <td>MobileNet V2</td>
-      <td>NVIDIA Jetson Orin Nano</td>
-    </tr>
-    <tr>
-      <td>Create a ChatGPT voice bot on a Raspberry Pi</td>
-      <td>Edge Device (Raspberry Pi)</td>
-      <td>gpt-4-turbo-preview</td>
-      <td>Raspberry Pi 5</td>
-    </tr>
-    <tr>
-      <td>LLM inference on Android with KleidiAI, MediaPipe, and XNNPACK</td>
-      <td>Edge Device (Smartphone)</td>
-      <td>Gemma 2B</td>
-      <td>Google Pixel 8 Pro (Android phone with support for i8mm)</td>
-    </tr>
-    <tr>
-      <td>Run Llama 3 on a Raspberry Pi 5 using ExecuTorch</td>
-      <td>Edge Device (Raspberry Pi)</td>
-      <td>Llama 3.1 8B</td>
-      <td>Raspberry Pi 5</td>
-    </tr>
-  </tbody>
-</table>
-
-<p><i><small>Note: The data collated here is sourced from Arm and third parties. While Arm uses reasonable efforts to keep this information accurate, Arm does not warrant (express or implied) or provide any guarantee of data correctness due to the ever-evolving AI and software landscape. Any links to third party sites and resources are provided for ease and convenience. Your use of such third-party sites and resources is subject to the third party’s terms of use, and use is at your own risk.</small></i></p>
-
-</body>
-</html>
+---
+title: README
+emoji: 🦀
+colorFrom: indigo
+colorTo: purple
+sdk: static
+pinned: false
+---
+
+<p>Arm’s AI development resources ensure you can deploy at pace, achieving the best performance on Arm by default. Our aim is to make your AI development easier, ensuring integration with all major operating systems and AI frameworks and enabling portability for deploying AI on Arm at scale.</p>
+<p>Discover below some key resources and content from Arm, including our software libraries and tools, that enable you to optimize for Arm architectures and deliver significant performance uplift for models, from traditional ML and computer vision workloads to small and large language models, running on Arm-based devices.</p>
+<br>
+<strong>Arm and Meta: <a href="https://huggingface.co/collections/meta-llama/llama-32-66f448ffc8c32f949b04c8cf" target="_blank">Llama 3.2</a><br>Accelerated cloud to edge AI performance</strong>
+<p>The availability of smaller LLMs that enable fundamental text-based generative AI workloads, such as Llama 3.2 1B and 3B, is critical to enabling AI inference at scale. Running the new Llama 3.2 3B LLM on Arm-powered mobile devices through the Arm CPU-optimized kernels leads to a 5x improvement in prompt processing and a 3x improvement in token generation, achieving 19.92 tokens per second in the generation phase. This means lower latency when processing AI workloads on the device and a far faster overall user experience. And the more AI processed at the edge, the less power spent moving data to and from the cloud, leading to energy and cost savings.</p>
+<p>Alongside running small models at the edge, we can also run larger models, such as Llama 3.2 11B and 90B, in the cloud. The 11B and 90B models are a great fit for CPU-based inference workloads in the cloud that generate text and images, as our data on Arm Neoverse V2 shows. Running the 11B image-and-text model on the Arm-based AWS Graviton4, we achieve 29.3 tokens per second in the generation phase, far outpacing the human reading speed of around 5 tokens per second.</p>
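The throughput figures quoted above can be put in perspective with a quick back-of-envelope calculation; this sketch uses only the numbers quoted in the text, not new measurements:

```python
# All constants below are the figures quoted in the text above.
mobile_generation_tps = 19.92   # Llama 3.2 3B generation rate on Arm mobile, optimized kernels
generation_speedup = 3          # quoted token-generation improvement from the optimized kernels
baseline_tps = mobile_generation_tps / generation_speedup  # implied unoptimized rate

graviton4_tps = 29.3            # Llama 3.2 11B generation rate on AWS Graviton4
human_reading_tps = 5.0         # approximate human reading speed

print(round(baseline_tps, 2))                       # 6.64 tokens/s before optimization
print(round(graviton4_tps / human_reading_tps, 1))  # 5.9x faster than reading speed
```

In other words, even the implied pre-optimization mobile rate exceeds reading speed, and the Graviton4 figure is nearly six times it.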
+<ul>
+  <li><a href="https://newsroom.arm.com/news/ai-inference-everywhere-with-new-llama-llms-on-arm" target="_blank">Accelerating and Scaling AI Inference Everywhere with New Llama 3.2 LLMs on Arm</a></li>
+  <li><a href="https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/" target="_blank">Meta Llama 3.2 Blog</a></li>
+  <li><a href="https://www.llama.com/docs/getting-the-models/1b3b-partners" target="_blank">Meta Llama 3.2 1B/3B Partner Guide</a></li>
+  <li><a href="https://www.youtube.com/watch?v=AVqm7SfNQrw" target="_blank">How Arm and Meta are Transforming AI Software Development</a></li>
+  <li><a href="https://www.arm.com/markets/artificial-intelligence/software" target="_blank">Arm AI Software Page</a></li>
+</ul>
+<br>
+<strong>Arm Kleidi: Unleashing Mass-Market AI Performance on Arm</strong>
+<p>Arm Kleidi is a targeted software suite, expediting optimizations for any framework and enabling acceleration for billions of AI workloads across Arm-based devices everywhere. Application developers achieve top performance by default, with no additional work or investment in new skills or tool training required.</p>
+<p><b>Useful Resources on Arm Kleidi:</b></p>
+<ul>
+  <li><a href="https://newsroom.arm.com/blog/kleidiai-integration-mediapipe" target="_blank">KleidiAI integration with Google's MediaPipe framework</a></li>
+  <li>Arm KleidiAI for optimizing any AI framework: <a href="https://gitlab.arm.com/kleidi/kleidiai" target="_blank">GitLab repo</a> and <a href="https://community.arm.com/arm-community-blogs/b/ai-and-ml-blog/posts/kleidiai" target="_blank">blog</a></li>
+  <li>Arm KleidiCV for optimizing any computer vision framework: <a href="https://gitlab.arm.com/kleidi/kleidicv" target="_blank">GitLab repo</a> and <a href="https://community.arm.com/arm-community-blogs/b/ai-and-ml-blog/posts/kleidicv" target="_blank">blog</a></li>
+  <li><a href="https://github.com/ARM-software/ComputeLibrary" target="_blank">Arm Compute Library for all AI software</a></li>
+</ul>
+<br>
+<strong>Running LLMs on Mobile</strong>
+<br>
+<p>Our foundation of pervasiveness, flexible performance, and energy efficiency means that Arm CPUs are already the hardware of choice for a variety of AI workloads. Alongside Arm-based servers excelling at LLM workloads, the Arm Kleidi software suite and optimizations to our software libraries, combined with the <a href="https://github.com/ggerganov/llama.cpp" target="_blank">open-source llama.cpp project</a>, enable generative AI to run efficiently on mobile devices.</p>
+<p>Our work includes a virtual assistant demo, which first used Meta’s Llama2-7B LLM on mobile via a chat-based application and has since expanded to include the Llama3 model and Phi-3 3.8B. <a href="https://community.arm.com/arm-community-blogs/b/ai-and-ml-blog/posts/generative-ai-on-mobile-on-arm-cpu" target="_blank">You can learn more about the technical implementation of the demos here.</a></p>
+<p><b>Find out more about the community contributions that make this happen:</b></p>
+<ul>
+  <li><a href="https://huggingface.co/TheBloke/Llama-2-7B-GGUF" target="_blank">Llama-2-7B-GGUF</a></li>
+  <li><a href="https://huggingface.co/TheBloke/phi-2-GGUF" target="_blank">Phi-2-GGUF</a></li>
+  <li><a href="https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF" target="_blank">Meta-Llama-3-8B-Instruct-GGUF</a></li>
+  <li><a href="https://huggingface.co/SanctumAI/Phi-3-mini-4k-instruct-GGUF" target="_blank">Phi-3-mini-4k-instruct-GGUF</a></li>
+  <!-- Add more links here -->
+</ul>
+<p>These advancements are also highlighted in our Learning Paths below.</p>
+<br>
+<strong>AI on Arm in the Cloud</strong>
+<br>
+<p>Arm Neoverse platforms give our infrastructure partners access to leading performance, efficiency, and unparalleled flexibility to innovate in pursuit of optimal solutions for emerging AI workloads. The flexibility of the Neoverse platform enables our innovative hardware partners to closely integrate additional compute acceleration into their designs, creating a new generation of built-for-AI custom data center silicon.</p>
+<p><b>Read the latest on AI on Neoverse:</b></p>
+<ul>
+  <li><a href="https://www.arm.com/developer-hub/servers-and-cloud-computing/ai-cloud" target="_blank">Accelerate Your GenAI, AI and ML Workloads on Arm CPUs</a></li>
+  <li><a href="https://community.arm.com/arm-community-blogs/b/infrastructure-solutions-blog/posts/accelerating-sentiment-analysis-on-arm-neoverse-cpus" target="_blank">Accelerating Popular Hugging Face Models using Arm Neoverse</a></li>
+  <li><a href="https://newsroom.arm.com/blog/small-language-models-on-arm" target="_blank">Small Language Models: Efficient Arm Computing Enables a Custom AI Future</a></li>
+  <li><a href="https://community.arm.com/arm-community-blogs/b/infrastructure-solutions-blog/posts/best-in-class-llm-performance-on-arm-neoverse-v1-based-aws-graviton3-servers" target="_blank">Best-in-class LLM Performance on Arm Neoverse V1 based AWS Graviton3 CPUs</a></li>
+</ul>
+<!-- Add relevant links or content here -->
+<br>
+<strong>Arm Learning Paths</strong>
+<br><p>Tutorials designed to help you develop quality Arm software faster.</p>
+<ul>
+  <li><a href="https://learn.arm.com/learning-paths/smartphones-and-mobile/kleidiai-on-android-with-mediapipe-and-xnnpack/" class="underline" target="_blank">LLMs on Android with KleidiAI, MediaPipe and XNNPACK</a></li>
+  <li><a href="https://learn.arm.com/learning-paths/servers-and-cloud-computing/llama-cpu/llama-chatbot/" class="underline" target="_blank">Run a Large Language Model (LLM) chatbot on Arm servers</a></li>
+  <li><a href="https://learn.arm.com/learning-paths/servers-and-cloud-computing/nlp-hugging-face/pytorch-nlp-hf/" target="_blank">Deploy an NLP model using PyTorch on an Arm-based device</a></li>
+  <li><a href="https://learn.arm.com/learning-paths/embedded-systems/llama-python-cpu/" target="_blank">Run a local LLM chatbot on a Raspberry Pi 5</a></li>
+  <li><a href="https://learn.arm.com/learning-paths/servers-and-cloud-computing/benchmark-nlp/" target="_blank">Accelerate Natural Language Processing (NLP) models from Hugging Face on Arm servers</a></li>
+</ul>
+<p>Contribute to our Learning Paths: <a href="https://github.com/ArmDeveloperEcosystem/arm-learning-paths/discussions/categories/ideas" target="_blank">suggest a new Learning Path</a> or <a href="https://learn.arm.com/learning-paths/cross-platform/_example-learning-path/" target="_blank">create one yourself</a> with support from the Arm community.</p>
+<br>
+<p><i><small>Note: The data collated here is sourced from Arm and third parties. While Arm uses reasonable efforts to keep this information accurate, Arm does not warrant (express or implied) or provide any guarantee of data correctness due to the ever-evolving AI and software landscape. Any links to third party sites and resources are provided for ease and convenience. Your use of such third-party sites and resources is subject to the third party’s terms of use, and use is at your own risk.</small></i></p>
+</body>