---
title: Heeha
app_file: app.py
sdk: gradio
sdk_version: 5.20.0
---
# Llama 3.2 3B Chat Interface

This project provides a Gradio web interface for interacting with the Llama 3.2 3B model using Hugging Face Transformers.
## Prerequisites

- Python 3.8 or higher
- CUDA-capable GPU (recommended for better performance)
- Hugging Face account with access to the Llama 3.2 models
## Setup

1. Clone this repository.
2. Install the required dependencies (a sample `requirements.txt` is sketched after this list):

   ```bash
   pip install -r requirements.txt
   ```

3. Set up your Hugging Face token as an environment variable (see the token-loading sketch after this list):

   ```bash
   export HF_TOKEN="your_huggingface_token_here"
   ```

   You can get your token from https://huggingface.co/settings/tokens.
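The exact pinned versions live in the repository's `requirements.txt`; a typical dependency set for this stack (the versions below are assumptions, not the repository's actual pins) looks like:

```text
gradio>=5.20.0
transformers
torch
accelerate
```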
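For reference, a minimal sketch of how `app.py` can pick up this variable; the variable name and error message here are illustrative, not the app's actual code:

```python
import os

# Read the token from the environment; fail fast with a clear message if it is missing.
hf_token = os.environ.get("HF_TOKEN")
if not hf_token:
    raise RuntimeError("HF_TOKEN is not set; run `export HF_TOKEN=...` before launching the app.")
```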
## Usage

Run the application:

```bash
python app.py
```

The Gradio interface will be available at `http://localhost:7860` by default.
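If that port is already in use, Gradio also reads the `GRADIO_SERVER_PORT` environment variable, so you can pick a different port without editing the code:

```bash
GRADIO_SERVER_PORT=8080 python app.py
```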
## Features

- Interactive chat interface built on the Transformers `pipeline` API (see the sketch below)
- Adjustable generation parameters (max new tokens and temperature)
- Example prompts for quick testing
- Automatic GPU utilization when available
- bfloat16 precision to reduce memory use and speed up inference on supported hardware
- Secure token handling through environment variables
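The snippet below is a minimal sketch of how these pieces can fit together, assuming a standard `text-generation` pipeline; the prompt, parameter values, and structure are illustrative, and the actual `app.py` may differ:

```python
import os

import torch
from transformers import pipeline

# Illustrative setup; the real app.py may structure this differently.
pipe = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-3B-Instruct",
    torch_dtype=torch.bfloat16,        # bfloat16 precision
    device_map="auto",                 # place the model on a GPU when one is available
    token=os.environ.get("HF_TOKEN"),  # token comes from the environment, never hard-coded
)

# The text-generation pipeline accepts chat-style message lists directly.
messages = [{"role": "user", "content": "Explain bfloat16 in one sentence."}]
result = pipe(messages, max_new_tokens=256, temperature=0.7)
print(result[0]["generated_text"][-1]["content"])
```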
## Note

The Llama 3.2 models are gated on Hugging Face, so your account must be granted access before the app can download the weights. You can request access at https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct.