Ahren09 committed on Apr 25, 2024

Commit

4bd21b6

verified ·

1 Parent(s): 5ca4e86

Delete README.md

Files changed (1)

README.md +0 -81

README.md DELETED

@@ -1,81 +0,0 @@
-# LLaVAGuard
-PyTorch implementation for the paper "*[LLaVAGuard: Safety Guardrails for Multimodal Large Language Models against Jailbreak Attacks](#)*"
-LLaVAGuard is a novel framework that adds multimodal safety guardrails to any input prompt. The guardrails are optimized to minimize the likelihood of generating harmful responses on the LLaVA-v1.5 model. We also demonstrate that these guardrails transfer to other prominent MLLMs, including GPT-4V, MiniGPT-4, and InstructBLIP, which broadens the scope of our solution.
-## Project Structure
-- **`cal_metrics.py`:** Script for summarizing the perplexity metrics over all examples.
-- **`get_metric.py`:** Script for calculating detoxify and Perspective API metrics.
-- **`eval_configs`:** Configuration files for model evaluations, including settings for llama and MiniGPT-4.
-- **`image_safety_patch.py`, `text_safety_patch.py`:** Scripts for generating safety patches from images and text.
-- **`instructblip_*.py`:** Scripts related to the InstructBLIP model, including defense strategies against constrained and unconstrained attacks, and question answering.
-- **`lavis`:** Submodule for the InstructBLIP model, which contains the dataset builders, models, processors, projects, runners, and tasks for various multimodal learning purposes.
-- **`metric`:** Implementations of metrics such as detoxify and Perspective API.
-- **`minigpt_*.py`:** Scripts related to the MiniGPT-4 model, including constrained and unconstrained inference, and question answering.
-- **`requirements.txt`:** Required Python packages for setting up the project.
-- **`scripts`:** Shell scripts for running all experiments.
-- **`utils.py`:** Utility functions supporting various operations across the project, such as image loading and preprocessing.
-- **`visual`:** Scripts for visualizing the overall toxicity results from InstructBLIP and MiniGPT-4 evaluations.
-- **`text_patch_heuristic`:** Pre-defined text guardrails.
-- **`text_patch_optimized`:** Optimized text guardrails.
-## Setup
-To get started with llavaguard, follow these setup steps:
-1. **Clone the Repository:**
-   ```bash
-   git clone <repository-url> llavaguard
-   cd llavaguard
-   ```
-2. **Install Dependencies:**
-   Make sure you have Python 3.10+ installed, then run:
-   ```bash
-   pip install -r requirements.txt
-   ```
-3. **Dataset Preparation:**
-   Download the two archives from [Google Drive](https://drive.google.com/drive/folders/14vdgC4L-Je6egzmVOfVczQ3-j-IzBQio?usp=sharing), place them in the project directory, and extract them:
-   ```bash
-   tar -xzvf adversarial_qna_images.tar.gz
-   tar -xzvf unconstrained_attack_images.tar.gz
-   ```
-## Usage
-The project includes several scripts and shell commands designed to perform specific tasks. Here are some examples:
-- Running the constrained and unconstrained attacks as well as the QnA task for the InstructBLIP model:
-  ```bash
-  bash scripts/run_instructblip_attack.sh
-  ```
-  This script collects the model responses and then calculates the metrics.
-  The procedure for running MiniGPT-4 is similar.
-- Running experiments for the baseline defense methods:
-  ```bash
-  bash scripts/run_instructblip_baseline.sh
-  ```
-- Running our LLaVAGuard defense methods:
-  ```bash
-  bash scripts/run_instructblip_safety_patch.sh
-  ```
-## Contributing
-Contributions to llavaguard are welcome. Please submit pull requests to the repository with a clear description of the changes and the purpose behind them.
-## License
-This project is released under the Apache 2.0 License.
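
The dataset preparation step in the removed README runs two `tar` commands back to back, so a missing download only shows up as a `tar` error. A minimal sketch of a guarded version follows. The function name `extract_archives` and its optional directory argument are hypothetical and not part of the repository; only the two archive names come from the README. The `-v` flag is dropped here to keep the output quiet.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): extract the two dataset
# archives named in the README, stopping early if one has not been downloaded.
extract_archives() {
  # Directory that holds the archives; defaults to the current directory.
  local dir="${1:-.}"
  local archive
  for archive in adversarial_qna_images.tar.gz unconstrained_attack_images.tar.gz; do
    if [ ! -f "$dir/$archive" ]; then
      echo "missing archive: $dir/$archive" >&2
      return 1
    fi
    # Unpack next to the archive, as the README's tar commands would do
    # when run from the project directory.
    tar -xzf "$dir/$archive" -C "$dir"
  done
}
```

Calling `extract_archives` from the project directory, or `extract_archives /path/to/project` from elsewhere, would unpack both archives in place and return a non-zero status as soon as one of them is missing.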