AI & ML interests

Run open source LLMs across CPU and GPU without changing the binary in Rust and Wasm locally!

Recent Activity

apepkuss79  updated a model about 19 hours ago
second-state/Phi-3.5-MoE-instruct-GGUF
apepkuss79  updated a model about 21 hours ago
second-state/Falcon3-1B-Instruct-GGUF
apepkuss79  updated a model about 21 hours ago
second-state/Falcon3-3B-Instruct-GGUF
View all activity

Run Open source LLMs and create OpenAI-compatible API services for the Llama2 series of LLMs locally With LlamaEdge!

Give it a try

Run a single command in your command line terminal.

bash <(curl -sSfL 'https://raw.githubusercontent.com/LlamaEdge/LlamaEdge/main/run-llm.sh') --interactive

Follow the on-screen instructions to install the WasmEdge Runtime and download your favorite open-source LLM. Then, choose whether you want to chat with the model via the CLI or via a web UI.

See it in action | GitHub | Docs

Why?

LlamaEdge, powered by Rust and WasmEdge, provides a strong alternative to Python in AI inference.

  • Lightweight. The total runtime size is 30MB.
  • Fast. Full native speed on GPUs.
  • Portable. Single cross-platform binary on different CPUs, GPUs, and OSes.
  • Secure. Sandboxed and isolated execution on untrusted devices.
  • Container-ready. Supported in Docker, containerd, Podman, and Kubernetes.

Learn more

Please visit the LlamaEdge project to learn more.

datasets

None public yet