Artples committed on
Commit ee2b52a · verified · 1 Parent(s): 5bbb076

Delete Finetuning_NoteBook.ipynb

Files changed (1)
  1. Finetuning_NoteBook.ipynb +0 -513
Finetuning_NoteBook.ipynb DELETED
@@ -1,513 +0,0 @@
- {
- "cells": [
- {
- "cell_type": "markdown",
- "id": "292aa39a",
- "metadata": {},
- "source": [
- "# Installing Required Libraries!"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "8f5ff902",
- "metadata": {},
- "source": [
- "Installing required libraries, including trl, transformers, accelerate, peft, datasets, and bitsandbytes."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "f74b4a0d",
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "# Checks if PyTorch is installed and installs it if not.\n",
- "try:\n",
- " import torch\n",
- " print(\"PyTorch is installed!\")\n",
- "except ImportError:\n",
- " print(\"PyTorch is not installed.\")\n",
- " !pip install -q torch\n",
- " import torch\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "d36b37f9",
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "!pip install -q --upgrade \"transformers==4.38.2\"\n",
- "!pip install -q --upgrade \"datasets==2.16.1\"\n",
- "!pip install -q --upgrade \"accelerate==0.26.1\"\n",
- "!pip install -q --upgrade \"evaluate==0.4.1\"\n",
- "!pip install -q --upgrade \"bitsandbytes==0.42.0\"\n",
- "!pip install -q --upgrade \"trl==0.7.11\"\n",
- "!pip install -q --upgrade \"peft==0.8.2\"\n",
- " "
- ]
- },
- {
- "cell_type": "markdown",
- "id": "e9f88bba",
- "metadata": {},
- "source": [
- "# Load and Prepare the Dataset"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "df19b148",
- "metadata": {},
- "source": [
- "The dataset is already formatted in a conversational format, which is supported by [trl](https://huggingface.co/docs/trl/index/), and ready for supervised finetuning."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "477e46f4",
- "metadata": {},
- "source": [
- "\n",
- "**Conversational format:**\n",
- "\n",
- "\n",
- "```python\n",
- "{\"messages\": [{\"role\": \"system\", \"content\": \"You are...\"}, {\"role\": \"user\", \"content\": \"...\"}, {\"role\": \"assistant\", \"content\": \"...\"}]}\n",
- "{\"messages\": [{\"role\": \"system\", \"content\": \"You are...\"}, {\"role\": \"user\", \"content\": \"...\"}, {\"role\": \"assistant\", \"content\": \"...\"}]}\n",
- "{\"messages\": [{\"role\": \"system\", \"content\": \"You are...\"}, {\"role\": \"user\", \"content\": \"...\"}, {\"role\": \"assistant\", \"content\": \"...\"}]}\n",
- "```\n"
- ]
- },
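For context, `trl` renders each `messages` list into a single training string using the tokenizer's chat template. A minimal sketch of that rendering, assuming a `tokenizer` with a chat template is available (the notebook's tokenizer is only set up before the `SFTTrainer` further below):

```python
# Sketch: how one conversational example becomes a single training string.
# Assumes `tokenizer` is a transformers tokenizer with a chat template set.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is LoRA?"},
    {"role": "assistant", "content": "A parameter-efficient finetuning method."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False)
print(text)
```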
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "4f9f3d7a",
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "from datasets import load_dataset\n",
- " \n",
- "# Load dataset from the hub\n",
- "dataset = load_dataset(\"HuggingFaceH4/ultrachat_200k\", split=\"train_sft\")\n",
- " \n",
- "dataset = dataset.shuffle(seed=42)\n",
- " "
- ]
- },
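A quick sanity check that the loaded split really is in the conversational format shown above (the `messages` column name comes from `ultrachat_200k`):

```python
# Inspect one shuffled example: it should expose a "messages" list of
# {"role": ..., "content": ...} dicts, as trl expects.
example = dataset[0]
print(example.keys())
for turn in example["messages"][:2]:
    print(turn["role"], "->", turn["content"][:80])
```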
- {
- "cell_type": "markdown",
- "id": "34a66934",
- "metadata": {},
- "source": [
- "## Setting LoRA Config"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "b34de536",
- "metadata": {},
- "source": [
- "The `SFTTrainer` provides native integration with `peft`, simplifying the process of efficiently tuning\n",
- "Large Language Models (LLMs) using techniques such as [LoRA](https://magazine.sebastianraschka.com/p/practical-tips-for-finetuning-llms).\n",
- "The only requirement is to create the `LoraConfig` and pass it to the `SFTTrainer`.\n",
- " "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "648afc1b",
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "from peft import LoraConfig\n",
- "\n",
- "peft_config = LoraConfig(\n",
- " lora_alpha=8,\n",
- " lora_dropout=0.05,\n",
- " r=6,\n",
- " bias=\"none\",\n",
- " target_modules=\"all-linear\",\n",
- " task_type=\"CAUSAL_LM\"\n",
- ")\n",
- " "
- ]
- },
- {
- "cell_type": "markdown",
- "id": "6950f0c4",
- "metadata": {},
- "source": [
- "## Setting the TrainingArguments"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "bd721228",
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "# Installing tensorboard to report the metrics\n",
- "!pip install -q tensorboard\n",
- " "
- ]
- },
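Since `report_to='tensorboard'` is set in the next cell, the metrics can be watched live from the notebook. A minimal sketch, assuming the default log location (`transformers` writes under a `runs` subfolder of `output_dir` when no `logging_dir` is given):

```python
# Launch TensorBoard inside the notebook, pointed at the trainer's logs.
%load_ext tensorboard
%tensorboard --logdir temp_/tmp/model/runs
```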
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "4ced0801",
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "from transformers import TrainingArguments\n",
- "\n",
- "args = TrainingArguments(\n",
- " output_dir=\"temp_/tmp/model\",\n",
- " num_train_epochs=15,\n",
- " per_device_train_batch_size=3,\n",
- " gradient_accumulation_steps=2,\n",
- " gradient_checkpointing=True,\n",
- " gradient_checkpointing_kwargs={'use_reentrant': False},\n",
- " optim=\"adamw_torch_fused\",\n",
- " logging_steps=10,\n",
- " save_strategy='epoch',\n",
- " learning_rate=2e-05,\n",
- " bf16=True,\n",
- " max_grad_norm=0.3,\n",
- " warmup_ratio=0.1,\n",
- " lr_scheduler_type='cosine',\n",
- " report_to='tensorboard', \n",
- " max_steps=-1,\n",
- " seed=42,\n",
- " overwrite_output_dir=True,\n",
- " remove_unused_columns=True\n",
- ")\n",
- " "
- ]
- },
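Worth spelling out: with `per_device_train_batch_size=3` and `gradient_accumulation_steps=2`, each optimizer step sees an effective batch of 6 sequences per device, and with packing at `max_seq_length=2048` (set on the `SFTTrainer` below) that is roughly 12,288 tokens per device per step:

```python
# Effective batch size per optimizer step, per device.
per_device_train_batch_size = 3
gradient_accumulation_steps = 2
effective_batch = per_device_train_batch_size * gradient_accumulation_steps
print(effective_batch)         # 6
print(effective_batch * 2048)  # 12288 tokens per step with packed sequences
```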
- {
- "cell_type": "markdown",
- "id": "432572ff",
- "metadata": {},
- "source": [
- "## Setting the Supervised Finetuning Trainer (`SFTTrainer`)\n",
- " \n",
- "This `SFTTrainer` is a wrapper around the `transformers.Trainer` class and inherits all of its attributes and methods.\n",
- "The trainer takes care of properly initializing the `PeftModel`. \n",
- " "
- ]
- },
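Note that the next cell references `model` and `tokenizer`, which are never defined earlier in the notebook; a model-loading cell is missing. A minimal sketch of what it would contain, with the checkpoint id left as a placeholder because the base model (`mistralai/Mistral-[]`) is elided throughout this notebook:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: the concrete Mistral checkpoint id is elided in this notebook.
model_id = "mistralai/Mistral-..."

# Load the base model in bfloat16, matching bf16=True in the TrainingArguments.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Packing needs a pad token; Mistral tokenizers ship without one.
tokenizer.pad_token = tokenizer.eos_token
```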
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "e5c50c4d",
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "from trl import SFTTrainer\n",
- "\n",
- "trainer = SFTTrainer(\n",
- " model=model,\n",
- " args=args,\n",
- " train_dataset=dataset,\n",
- " peft_config=peft_config,\n",
- " max_seq_length=2048,\n",
- " tokenizer=tokenizer,\n",
- " packing=True,\n",
- " dataset_kwargs={'add_special_tokens': False, 'append_concat_token': False}\n",
- ")\n"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "f6372ddf",
- "metadata": {},
- "source": [
- "### Starting Training and Saving Model/Tokenizer\n",
- "\n",
- "We start training the model by calling the `train()` method on the trainer instance. This will start the training\n",
- "loop and train the model for `15 epochs`. Checkpoints are saved automatically to the output directory (**'temp_/tmp/model'**);\n",
- "the model is pushed to the Hub under **'User//tmp/model'** in the final steps of this notebook. \n",
- " \n",
- " "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "90ea4297",
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "\n",
- "model.config.use_cache = False\n",
- "\n",
- "# start training\n",
- "trainer.train()\n",
- "\n",
- "# save the peft model\n",
- "trainer.save_model()\n"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "f2ae5eb4",
- "metadata": {},
- "source": [
- "### Free the GPU Memory to Prepare Merging `LoRA` Adapters with the Base Model\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "a7524f47",
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "\n",
- "# Free the GPU memory\n",
- "del model\n",
- "del trainer\n",
- "torch.cuda.empty_cache()\n"
- ]
- },
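If GPU memory is still held after this (e.g. in `nvidia-smi`), forcing a garbage-collection pass before emptying the cache usually releases the lingering references; a minimal addition:

```python
import gc

# Collect the dropped Python references first, then release cached CUDA blocks.
gc.collect()
torch.cuda.empty_cache()
```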
- {
- "cell_type": "markdown",
- "id": "a67f6349",
- "metadata": {},
- "source": [
- "## Merging LoRA Adapters into the Original Model\n",
- "\n",
- "While utilizing `LoRA`, we focus on training the adapters rather than the entire model. Consequently, during the \n",
- "model saving process, only the `adapter weights` are preserved, not the complete model. If we wish to save the \n",
- "entire model for easier usage with Text Generation Inference, we can incorporate the adapter weights into the model \n",
- "weights. This can be achieved using the `merge_and_unload` method. Following this, the model can be saved using the \n",
- "`save_pretrained` method. The result is a standalone model that is ready for inference.\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "ba375754",
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "import torch\n",
- "from peft import AutoPeftModelForCausalLM\n",
- "\n",
- "# Load Peft model on CPU\n",
- "model = AutoPeftModelForCausalLM.from_pretrained(\n",
- " \"temp_/tmp/model\",\n",
- " torch_dtype=torch.float16,\n",
- " low_cpu_mem_usage=True\n",
- ")\n",
- " \n",
- "# Merge LoRA with the base model and save\n",
- "merged_model = model.merge_and_unload()\n",
- "merged_model.save_pretrained(\"/tmp/model\", safe_serialization=True, max_shard_size=\"2GB\")\n",
- "tokenizer.save_pretrained(\"/tmp/model\")\n"
- ]
- },
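A short generation against the merged checkpoint is a cheap way to verify it works end to end; a minimal sketch using the `transformers` pipeline (the prompt and sampling settings are illustrative):

```python
from transformers import pipeline

# Load the merged model from disk and generate a short completion.
pipe = pipeline(
    "text-generation",
    model="/tmp/model",
    torch_dtype=torch.float16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Say hello in one sentence."}]
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(pipe(prompt, max_new_tokens=64, do_sample=False)[0]["generated_text"])
```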
- {
- "cell_type": "markdown",
- "id": "1cdbbc86",
- "metadata": {},
- "source": [
- "### Copy all result folders from 'temp_/tmp/model' to '/tmp/model'"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "3254f6a5",
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "import os\n",
- "import shutil\n",
- "\n",
- "source_folder = \"temp_/tmp/model\"\n",
- "destination_folder = \"/tmp/model\"\n",
- "os.makedirs(destination_folder, exist_ok=True)\n",
- "for item in os.listdir(source_folder):\n",
- " item_path = os.path.join(source_folder, item)\n",
- " if os.path.isdir(item_path):\n",
- " destination_path = os.path.join(destination_folder, item)\n",
- " shutil.copytree(item_path, destination_path)\n"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "38468ef8",
- "metadata": {},
- "source": [
- "### Generating a model card (README.md)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "aea8f916",
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "card = '''\n",
- "---\n",
- "license: apache-2.0\n",
- "tags:\n",
- "- generated_from_trainer\n",
- "- mistralai/Mistral\n",
- "- PyTorch\n",
- "- transformers\n",
- "- trl\n",
- "- peft\n",
- "- tensorboard\n",
- "base_model: mistralai/Mistral-[]\n",
- "widget:\n",
- " - example_title: Pirate!\n",
- " messages:\n",
- " - role: system\n",
- " content: You are a pirate chatbot who always responds with Arr!\n",
- " - role: user\n",
- " content: \"There's a llama on my lawn, how can I get rid of him?\"\n",
- " output:\n",
- " text: >-\n",
- " Arr! 'Tis a puzzlin' matter, me hearty! A llama on yer lawn be a rare\n",
- " sight, but I've got a plan that might help ye get rid of 'im. Ye'll need\n",
- " to gather some carrots and hay, and then lure the llama away with the\n",
- " promise of a tasty treat. Once he's gone, ye can clean up yer lawn and\n",
- " enjoy the peace and quiet once again. But beware, me hearty, for there\n",
- " may be more llamas where that one came from! Arr!\n",
- "model-index:\n",
- "- name: /tmp/model\n",
- " results: []\n",
- "datasets:\n",
- "- HuggingFaceH4/ultrachat_200k\n",
- "language:\n",
- "- en\n",
- "pipeline_tag: text-generation\n",
- "---\n",
- "\n",
- "# Model Card for /tmp/model:\n",
- "\n",
- "**/tmp/model** is a language model that is trained to act as a helpful assistant. It is a finetuned version of [mistralai/Mistral-[]](https://huggingface.co/mistralai/Mistral-[]) that was trained using `SFTTrainer` on the publicly available dataset [\n",
- "HuggingFaceH4/ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k).\n",
- "\n",
- "## Training Procedure:\n",
- "\n",
- "The training code used to create this model was generated by [Menouar/LLM-FineTuning-Notebook-Generator](https://huggingface.co/spaces/Menouar/LLM-FineTuning-Notebook-Generator).\n",
- "\n",
- "\n",
- "\n",
- "## Training hyperparameters\n",
- "\n",
- "The following hyperparameters were used during the training:\n",
- "\n",
- "\n",
- "'''\n",
- "\n",
- "with open(\"/tmp/model/README.md\", \"w\") as f:\n",
- " f.write(card)\n",
- "\n",
- "args_dict = vars(args)\n",
- "\n",
- "with open(\"/tmp/model/README.md\", \"a\") as f:\n",
- " for k, v in args_dict.items():\n",
- " f.write(f\"- {k}: {v}\")\n",
- " f.write(\"\\n \\n\")\n"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "9088a475",
- "metadata": {},
- "source": [
- "## Login to HF"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "c3359fe7",
- "metadata": {},
- "source": [
- "Replace `HF_TOKEN` with a valid token in order to push **'/tmp/model'** to `huggingface_hub`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "4af7ed8e",
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "# Install huggingface_hub\n",
- "!pip install -q huggingface_hub\n",
- " \n",
- "from huggingface_hub import login\n",
- " \n",
- "login(\n",
- " token='HF_TOKEN',\n",
- " add_to_git_credential=True\n",
- ")\n",
- " "
- ]
- },
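An alternative that avoids pasting a token into the notebook source is the interactive prompt from `huggingface_hub`:

```python
# Interactive alternative: prompts for the token instead of hardcoding it.
from huggingface_hub import notebook_login

notebook_login()
```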
- {
- "cell_type": "markdown",
- "id": "08681a27",
- "metadata": {},
- "source": [
- "## Pushing '/tmp/model' to the Hugging Face account."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "9a3930f9",
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "from huggingface_hub import HfApi, HfFolder\n",
- "\n",
- "# Instantiate the HfApi class\n",
- "api = HfApi()\n",
- "\n",
- "# Our Hugging Face repository\n",
- "repo_name = \"/tmp/model\"\n",
- "\n",
- "# Create a repository on the Hugging Face Hub\n",
- "repo = api.create_repo(token=HfFolder.get_token(), repo_type=\"model\", repo_id=repo_name)\n",
- "\n",
- "api.upload_folder(\n",
- " folder_path=\"/tmp/model\",\n",
- " repo_id=repo.repo_id\n",
- ")\n"
- ]
- }
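Once the upload finishes, listing the repository contents is an easy way to confirm that the model card, weights, and tokenizer files all landed:

```python
# List what was actually uploaded to the repository.
from huggingface_hub import list_repo_files

print(list_repo_files(repo.repo_id))
```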
- ],
- "metadata": {
- "language_info": {
- "name": "python"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 5
- }