---
license: apache-2.0
---

<center>

<h3>Welcome to llawa</h3>

<img src="https://huggingface.co/juntaoyuan/llawa/resolve/main/llawa-logo.png"/>

<br/><i>a.k.a. Llama2 + Wasm QA</i>

</center>

The models in this repo are Llama2 7B chat models further fine-tuned on Wasm-related Q&As.

Instead of struggling with Python and PyTorch, the simplest way to run them on your own laptop, server, or edge device is to use the [WasmEdge Runtime](https://github.com/WasmEdge/WasmEdge).

Learn more about this [fast, lightweight, portable, and ZERO Python dependency approach](https://medium.com/stackademic/fast-and-portable-llama2-inference-on-the-heterogeneous-edge-a62508e82359) to running AI applications!

1. Install WasmEdge with the WASI-NN GGML plugin.

```
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugins wasi_nn-ggml
```
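
After the installer finishes, you can load its environment and confirm the install worked. This is a sketch assuming the installer's default location of `$HOME/.wasmedge`; adjust the path if you customized it.

```shell
# Load the environment variables written by the WasmEdge installer
# (assumes the default install location under $HOME/.wasmedge)
source $HOME/.wasmedge/env

# Print the installed WasmEdge version to verify the install
wasmedge --version
```
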

2. Download the chat app. It is a portable Wasm bytecode app that runs across CPUs, GPUs, and operating systems.

```
curl -LO https://github.com/second-state/llama-utils/raw/main/chat/llama-chat.wasm
```
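
The run command in the next step expects the quantized model file in the current directory. A sketch of fetching it from this repo, assuming the file is published here under the standard Hugging Face `resolve` URL layout:

```shell
# Download the quantized GGUF model from this repo
# (URL assumes the standard Hugging Face resolve path for this file)
curl -LO https://huggingface.co/juntaoyuan/llawa/resolve/main/llama-2-7b-chat-wasm-q5_k_m.gguf
```
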

3. Chat with the model via the CLI.

```
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat-wasm-q5_k_m.gguf \
  llama-chat.wasm --reverse-prompt "</s>"
```

Below is an example chat session:

```
[USER]:
What is Wasm?
[ASSISTANT]:
Wasm is a binary instruction format for a stack-based virtual machine. It's a portable target for the WebAssembly runtime.

[USER]:
What is WASI-NN?
[ASSISTANT]:
WASI-NN is a Wasm system interface for neural networks. It allows Wasm apps to use and integrate with different neural network frameworks.

[USER]:
Which runtimes support it?
[ASSISTANT]:
Wasm is supported by a number of runtimes including WasmEdge, Wasmtime, and Wasmer.

[USER]:
How do I run AI workloads in serverless functions?
[ASSISTANT]:
You can use Wasm to run AI workloads in serverless functions. WasmEdge supports running AI workloads using the WASI-NN interface.
```