---
license: apache-2.0
---

<center>

<h3>Welcome to llawa</h3>

<img src="https://huggingface.co/juntaoyuan/llawa/resolve/main/llawa-logo.png"/>

<br/><i>a.k.a. Llama2 + Wasm QA</i>

</center>

The models in this repo are Llama2 7B chat models further fine-tuned on Wasm-related Q&As.

Instead of struggling with Python and PyTorch, the simplest way to run them on your own laptop, server, or edge device is to use the [WasmEdge Runtime](https://github.com/WasmEdge/WasmEdge).

Learn more about this [fast, lightweight, portable, and ZERO Python dependency approach](https://medium.com/stackademic/fast-and-portable-llama2-inference-on-the-heterogeneous-edge-a62508e82359) to running AI applications!

1. Install WasmEdge with the WASI-NN GGML plugin.

```
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugins wasi_nn-ggml
```
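
After the installer finishes, you can load its environment and confirm the install worked. This is a sketch assuming the installer's default location of `$HOME/.wasmedge`; adjust the path if you customized it.

```shell
# Load the environment variables written by the WasmEdge installer
# (assumes the default install location under $HOME/.wasmedge)
source $HOME/.wasmedge/env

# Print the installed WasmEdge version to verify the install
wasmedge --version
```
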

2. Download the chat app. It is a portable Wasm bytecode app that runs across CPUs, GPUs, and operating systems.

```
curl -LO https://github.com/second-state/llama-utils/raw/main/chat/llama-chat.wasm
```
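
The run command in the next step expects the quantized model file in the current directory. A sketch of fetching it from this repo, assuming the file is published here under the standard Hugging Face `resolve` URL layout:

```shell
# Download the quantized GGUF model from this repo
# (URL assumes the standard Hugging Face resolve path for this file)
curl -LO https://huggingface.co/juntaoyuan/llawa/resolve/main/llama-2-7b-chat-wasm-q5_k_m.gguf
```
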

3. Chat with the model via the CLI.

```
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat-wasm-q5_k_m.gguf \
  llama-chat.wasm --reverse-prompt "</s>"
```

Below is an example chat session:

```
[USER]:
What is Wasm?
[ASSISTANT]:
Wasm is a binary instruction format for a stack-based virtual machine. It's a portable target for the WebAssembly runtime.

[USER]:
What is WASI-NN?
[ASSISTANT]:
WASI-NN is a Wasm system interface for neural networks. It allows Wasm apps to use and integrate with different neural network frameworks.

[USER]:
Which runtimes support it?
[ASSISTANT]:
Wasm is supported by a number of runtimes including WasmEdge, Wasmtime, and Wasmer.

[USER]:
How do I run AI workloads in serverless functions?
[ASSISTANT]:
You can use Wasm to run AI workloads in serverless functions. WasmEdge supports running AI workloads using the WASI-NN interface.
```