---
title: Heeha
app_file: app.py
sdk: gradio
sdk_version: 5.20.0
---

# Llama 3.2 3B Chat Interface

This project provides a Gradio web interface for chatting with Meta's Llama 3.2 3B Instruct model using Hugging Face Transformers.

## Prerequisites

- Python 3.8 or higher
- CUDA-capable GPU (recommended for better performance)
- A Hugging Face account with access to the Llama 3.2 models

## Setup

1. Clone this repository.
2. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Set your Hugging Face token as an environment variable:

   ```bash
   export HF_TOKEN="your_huggingface_token_here"
   ```

   You can create a token at https://huggingface.co/settings/tokens.
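Inside app.py, the token lookup presumably amounts to reading that environment variable. A minimal sketch (the function name `get_hf_token` is illustrative, not taken from the app's code):

```python
import os

def get_hf_token() -> str:
    """Read the token exported in the step above.

    Fails fast with a clear message if the variable is missing, so the app
    does not silently fall back to anonymous (unauthorized) downloads.
    """
    token = os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError('HF_TOKEN is not set; run: export HF_TOKEN="..."')
    return token
```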

## Usage

Run the application:

```bash
python app.py
```

The Gradio interface will be available at http://localhost:7860 by default.
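If the default address is unsuitable, Gradio also honors its standard `GRADIO_SERVER_NAME` and `GRADIO_SERVER_PORT` environment variables, so the host and port can be overridden without editing app.py:

```bash
# Override Gradio's default host/port before launching the app.
export GRADIO_SERVER_NAME="0.0.0.0"   # listen on all interfaces
export GRADIO_SERVER_PORT="7861"      # serve on a non-default port
echo "http://${GRADIO_SERVER_NAME}:${GRADIO_SERVER_PORT}"
# then: python app.py
```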

## Features

- Interactive chat interface built on the Transformers `pipeline` API
- Adjustable generation parameters (max new tokens and temperature)
- Example prompts for quick testing
- Automatic GPU utilization when available
- bfloat16 precision for faster inference and lower memory use
- Token handling through the `HF_TOKEN` environment variable rather than hard-coded secrets
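Taken together, the features above suggest an app.py along these lines. This is a hedged sketch, not the actual file: the helper names (`build_messages`, `respond`) and slider ranges are illustrative, and the heavy imports are deferred into `main()` so the pure chat-history logic can be read on its own.

```python
import os

def build_messages(history, user_message):
    """Append the new user turn to the prior chat history (messages format)."""
    return list(history) + [{"role": "user", "content": user_message}]

def main():
    # Deferred imports: torch/transformers/gradio are only needed at runtime.
    import torch
    import gradio as gr
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model="meta-llama/Llama-3.2-3B-Instruct",
        torch_dtype=torch.bfloat16,        # bfloat16 precision
        device_map="auto",                 # use the GPU when available
        token=os.environ.get("HF_TOKEN"),  # token from the environment
    )

    def respond(message, history, max_new_tokens, temperature):
        out = pipe(
            build_messages(history, message),
            max_new_tokens=int(max_new_tokens),
            temperature=float(temperature),
            do_sample=True,
        )
        # For chat-format input, generated_text is the full conversation;
        # the assistant's reply is the last message.
        return out[0]["generated_text"][-1]["content"]

    gr.ChatInterface(
        respond,
        type="messages",
        additional_inputs=[
            gr.Slider(32, 2048, value=512, label="Max new tokens"),
            gr.Slider(0.1, 2.0, value=0.7, label="Temperature"),
        ],
    ).launch()

if __name__ == "__main__":
    main()
```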

## Note

You need to have been granted access to the gated Llama 3.2 models on Hugging Face. You can request access at https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct.