torchtune research repo: token coloring (colorful llama)
Playground to try out token coloring with TorchTune.
The repo was generated using the alpha version of torchtune.
Brief notes:
- The starting recipe is based on the Alpaca Llama2 7B full finetune recipe (switched to bf16).
- I assume
output/
is used to store model outputs andmodel/
is used to store the base model checkpoints.
For the colorful
recipe:
- I copied a lot of functionality (like the actual model definition, dataset, etc) from torchtune repository directly since I needed to make changes.
- I reduced the flexiblity of the recipe (e.g. cannot specify the model or tokenizer) and increased it in other ways (e.g. can pass in a dataset path directly).
- I added intermediate checkpointing (i.e. every
n
steps) and automatically upload the checkpoint to HuggingFace Hub.
Getting started
The below instructions can be copy-pasted as is on to a running instance. They assume that the HF_TOKEN
environment variable is set with a valid token.
# for RunPod
cd /workspace
git clone [email protected]:pytorch-labs/torchtune.git
cd torchtune
pip install -e .
cd /workspace
git clone [email protected]:laurencer/torchtune-colorful-llama.git
cd torchtune-colorful-llama
# for wandb support
pip install wandb
mkdir -p model/
tune download --repo-id meta-llama/Llama-2-7b --output-dir model/
tune convert_checkpoint --checkpoint-path model/consolidated.00.pth --output-path model/llama2_native.tune
mkdir -p output/
# tune --nnodes 1 --nproc_per_node 1 ./colorful/full_finetune.py --config ./colorful/basic_config.yaml
nohup tune --nnodes 1 --nproc_per_node 1 ./colorful/full_finetune.py --config ./colorful/basic_config.yaml 2>&1 > training_log_$(date "+%Y.%m.%d_%H.%M.%S").log &
sleep 1
tail -f training_log_*.log
Baselines
Two baseline configs are provided in the baseline
directory.
We forked the original recipe to support customizing the location/path of the Alpaca dataset.
# tune --nnodes 1 --nproc_per_node 1 ./baseline/full_finetune.py --config ./baseline/baseline_config.yaml
nohup tune --nnodes 1 --nproc_per_node 1 ./baseline/full_finetune.py --config ./baseline/baseline_config.yaml 2>&1 > training_log_$(date "+%Y.%m.%d_%H.%M.%S").log &
sleep 1
tail -f training_log_*.log
The adversarial config uses a dataset that is equivalent to 4x the original alpaca cleaned dataset with extra examples that include prompt injection attempts. See token coloring description for more info.
# tune --nnodes 1 --nproc_per_node 1 ./baseline/full_finetune.py --config ./baseline/adversarial_config.yaml
nohup tune --nnodes 1 --nproc_per_node 1 ./baseline/full_finetune.py --config ./baseline/adversarial_config.yaml 2>&1 > training_log_$(date "+%Y.%m.%d_%H.%M.%S").log &
sleep 1
tail -f training_log_*.log
Colorful
The colorful
directory implements the changes required to support token coloring. This includes a custom dataset implementation and training script.