{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": { "id": "Q-bj6K7Qv4ft" }, "source": [ "# Instruction-Tuning a Generative Pretrained Transformer\n", "\n", "1. First, install the `transformers` library." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install transformers" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "- Load the dataset.\n", "- Load the GPT-2 tokenizer and add the chosen special tokens (`'<|startoftext|>'`, `'<|endoftext|>'`, `'<|pad|>'`).\n", "- Create demonstrations by delimiting each prompt and completion with the special tokens.\n", "- Calculate the maximum length (in tokens) of the demonstrations." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 423 }, "id": "7MbpXGu-v4f1", "outputId": "2f764046-c977-4187-c9fc-5a37eb6ff28b" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Found cached dataset parquet (C:/Users/CWLINK/.cache/huggingface/datasets/nicholasKluge___parquet/nicholasKluge--fine-tuning-instruct-aira-c7a6e731d782bc09/0.0.0/2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "75b57109c2544501a6f6eac494b6ba0b", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/4 [00:00" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ " prompt \\\n", "0 I was wondering if you could walk me through t... \n", "1 What type of wine goes best with steak. \n", "2 How do I know if this is a good investment. \n", "3 Please provide me with some financial advice. \n", "4 What kind of safety devices do I need to insta... \n", "... ... \n", "42869 How do computers communicate and network with ... \n", "42870 How are websites different from web applications? \n", "42871 What is open-source software and its benefits? \n", "42872 What is a cookie and how is it used in web bro... \n", "42873 What is cloud storage and its advantages for d... \n", "\n", " completion \n", "0 Sure! The process for setting up a hydroponic ... \n", "1 The best type of wine to pair with steak depen... \n", "2 Answer: To determine if an investment is a goo... \n", "3 Some financial advice is to always pay yoursel... \n", "4 The type of safety devices you should install ... \n", "... ... \n", "42869 Computers communicate and network with each ot... \n", "42870 Websites and web applications are similar in t... \n", "42871 Open-source software is software that is made ... \n", "42872 A cookie is a small piece of data that a websi... \n", "42873 Cloud storage is a service that allows you to ... 
\n", "\n", "[42874 rows x 2 columns]" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "21930e5b069c42eba7b67aa9323901c4", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Downloading: 0%| | 0.00/1.04M [00:00', \n", " eos_token='<|endoftext|>', \n", " pad_token='<|pad|>')\n", "\n", "# create column that concatenates the two sentences\n", "df['demonstrations'] = tokenizer.bos_token + df['prompt'] + tokenizer.eos_token + df['completion'] + tokenizer.eos_token\n", "\n", "# calculate the length of the text\n", "df['length'] = df['demonstrations'].apply(lambda x: len(tokenizer.encode(x)))\n", "\n", "print(\"Total number of demonstrations: \", len(df))\n", "print(f\"The longest demonstration is {df['length'].max()} tokens long.\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "2. Create the Dataset class." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "id": "WlbAfMQ4v4gA" }, "outputs": [], "source": [ "import torch\n", "from torch.utils.data import Dataset\n", "\n", "max_length = 300\n", "\n", "class DemoDataset(Dataset):\n", "\n", "    def __init__(self, demonstrations, tokenizer, max_length=max_length):\n", "\n", "        self.tokenizer = tokenizer\n", "        self.input_ids = []\n", "        self.attn_masks = []\n", "\n", "        for demo in demonstrations:\n", "\n", "            encodings_dict = tokenizer(demo,\n", "                                       truncation=True,\n", "                                       max_length=max_length,\n", "                                       padding=\"max_length\")\n", "\n", "            self.input_ids.append(torch.tensor(encodings_dict['input_ids']))\n", "            self.attn_masks.append(torch.tensor(encodings_dict['attention_mask']))\n", "\n", "    def __len__(self):\n", "        return len(self.input_ids)\n", "\n", "    def __getitem__(self, idx):\n", "        return self.input_ids[idx], self.attn_masks[idx]" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "3. Split the data into training and validation splits." 
] }, { "cell_type": "code", "execution_count": 12, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "-IOfa2PEv4gD", "outputId": "151f4cae-32f2-45d6-f205-60302dc7de5b" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of training samples: 38,586\n", "Number of validation samples: 4,288\n" ] } ], "source": [ "from torch.utils.data import random_split\n", "\n", "dataset = DemoDataset(df.demonstrations.to_list(), tokenizer, max_length=max_length)\n", "\n", "train_size = int(0.9 * len(dataset))\n", "val_size = len(dataset) - train_size\n", "\n", "train_dataset, val_dataset = random_split(dataset, [train_size, val_size])\n", "\n", "print('Number of training samples: {:,}'.format(train_size))\n", "print('Number of validation samples: {:,}'.format(val_size))" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "4. Create the `DataLoaders`." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "id": "cUkCNV-6v4gG" }, "outputs": [], "source": [ "from torch.utils.data import DataLoader, RandomSampler, SequentialSampler\n", "\n", "train_dataloader = DataLoader(\n", "    train_dataset,\n", "    sampler=RandomSampler(train_dataset),\n", "    batch_size=32\n", ")\n", "\n", "# validation data loader doesn't need randomization\n", "validation_dataloader = DataLoader(\n", "    val_dataset,\n", "    sampler=SequentialSampler(val_dataset),\n", "    batch_size=32\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "5. Load the base model." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 550, "referenced_widgets": [ "a7bb2d4cc1f74f33a62466e130d3c1a9", "9706bd5a6d164205989e96923280c522", "ef7694f2925d4422804baaede5c64c0b", "ef8d6e8fb2d54711bac0321e577ce9a5", "b629c8287f904d5c904f19d148ff663e", "7b95ba7da67a413f8e45864ca909cd6f", "681e6b5e2a484e63aabe7f9fb2767ad4", "55cecbcb63304d9f80f4931a368232e0", "5be34fa7a15a441d96b20daef61c6891", "d8c1b2a4ddc8416aa834c9379b307d4c", "e7004f2f677f4401bfbf729720ca987e", "567c00bdbe7f45cab7e48eca1a86ee62", "09570e24b3874bb99eed96409166e9e7", "f803d30823a64323886d2fd5315858ca", "db12c8a901d644c0b9aac03b7f1d37a4", "cebda9f32b0d4519b7bebfbf67ac410e", "0b8dced3f8bc4172a46c93fccca737eb", "e8b56ed15a48482096142f730bb248f3", "5d489be7701f4057874e80f081f04c1d", "7574f84c6d3c484cb1d07f417c47ba98", "4a85a12703a04dc8acb6808e520c9121", "32fdf73a7b36404fbe56e5972b868291" ] }, "id": "Rmg-5YJqv4gH", "outputId": "3e615596-cbae-41a7-b2a1-a8fd9fca9527" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a7bb2d4cc1f74f33a62466e130d3c1a9", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Downloading pytorch_model.bin: 0%| | 0.00/548M [00:00 model.config.n_layer - UNFREEZE_LAST_N:\n", " for parameter in m.parameters():\n", " parameter.requires_grad = True \n", "\n", " for parameter in model.transformer.ln_f.parameters(): \n", " parameter.requires_grad = True\n", "\n", " for parameter in model.lm_head.parameters(): \n", " parameter.requires_grad = True\n", "\n", "# Count the frozen and trainable parameter tensors\n", "# (model.parameters() yields weight/bias tensors, not whole layers)\n", "num_frozen_layers = sum(1 for parameter in model.parameters() if not parameter.requires_grad)\n", "num_trainable_layers = sum(1 for parameter in model.parameters() if parameter.requires_grad)\n", "\n", "print(\"Number of frozen parameter tensors:\", num_frozen_layers)\n", "print(\"Number of trainable parameter tensors:\", num_trainable_layers)" ] }, { "attachments": {}, 
"cell_type": "markdown", "metadata": {}, "source": [ "6. Setting training parameters." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "id": "qlbLg6tqv4gI" }, "outputs": [], "source": [ "from transformers import get_linear_schedule_with_warmup\n", "\n", "epochs = 5\n", "\n", "warmup_steps = 1e2\n", "\n", "sample_every = 100 # generate a sample every 100 batches\n", "\n", "optimizer = torch.optim.AdamW(model.parameters(), lr = 5e-4, eps = 1e-8)\n", "\n", "total_steps = len(train_dataloader) * epochs\n", "\n", "scheduler = get_linear_schedule_with_warmup(optimizer, \n", " num_warmup_steps = warmup_steps, \n", " num_training_steps = total_steps)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "7. Training/Validation loop." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "_X_m8XOtv4gR", "outputId": "a234e6b3-ef01-4bc9-a2ed-c82a794e022f" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Beginning epoch 1 of 5\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " 35%|███▌ | 100/283 [01:00<01:43, 1.77it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Batch 100 of 283. Loss:0.92487633228302.\n", "Example output: How does deep learning impact privacy and safeguarding of data?Deep learning enables a model to perform tasks in a manner that is both deterministic and intelligent: it employs an adversarial approach in such a way that is often transparent, but is not biased in its results. It has advantages over natural language processing.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " 71%|███████ | 200/283 [01:58<00:47, 1.76it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Batch 200 of 283. 
Loss:0.7599731683731079.\n", "Example output: What are the main applications of machine learning?Machine learning encompasses tasks such as identifying and extracting patterns in data, analyzing data, and analyzing patterns in the data. It also includes several computational tasks such as segmentation, clustering, and machine learning algorithms that aim to gather information across several dimensions.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 283/283 [02:46<00:00, 1.70it/s]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Average Training Loss: 1.7797733711690868.\n", "Validation loss: 0.6214480362832546.\n", "Beginning epoch 2 of 5\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " 35%|███▌ | 100/283 [00:56<01:43, 1.77it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Batch 100 of 283. Loss:0.6944515109062195.\n", "Example output: What are some of the major debates in the field of epistemology?Epistemology is an intriguing field of study that centers around the study of the epistemology of texts. Scholars and philosophers from various religions, from Plato to Xenophon have explored this field in tandem. There have also been several controversies surrounding the extent to which certain individuals can make substantial contributions to the study of epistemology, including those of Socrates and Marcus Aurelius. Epistemology offers a fascinating exploration of one of the primary strands of social science, namely the role of the mind in epistemology.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " 71%|███████ | 200/283 [01:55<00:47, 1.76it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Batch 200 of 283. Loss:0.6129528284072876.\n", "Example output: Can you explain the origin of Occam's razor?The ancient Greek philosopher Zeno of Citium, known as Stoics, was the first to argue for the existence of objective reality. 
In his treatise on the subject, he presented the threefold nature of things, namely, the objective nature, relative knowledge, and personal perception.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 283/283 [02:42<00:00, 1.74it/s]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Average Training Loss: 0.598113334221048.\n", "Validation loss: 0.5255831023678184.\n", "Beginning epoch 3 of 5\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " 35%|███▌ | 100/283 [00:56<01:43, 1.77it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Batch 100 of 283. Loss:0.5187301635742188.\n", "Example output: How can we ensure that AI does not compromise the privacy of children and adolescents?AI technologies should be developed to ensure that they comply with human rights, which include Child and Adolescent Rights. The primary objective of this principle is to establish ethical standards and standards for AI development and use.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " 71%|███████ | 200/283 [01:53<00:47, 1.77it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Batch 200 of 283. Loss:0.49698150157928467.\n", "Example output: How far are we from achieving general artificial intelligence?The realm of General Intelligence encompasses the capability of AGI to undertake any task that can be undertaken by any intelligent entity, including, but not limited to, Artificial General Intelligence (AGI).\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 283/283 [02:41<00:00, 1.76it/s]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Average Training Loss: 0.5052632845332682.\n", "Validation loss: 0.48149530217051506.\n", "Beginning epoch 4 of 5\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " 35%|███▌ | 100/283 [00:56<01:43, 1.77it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Batch 100 of 283. 
Loss:0.4454309046268463.\n", "Example output: How does Goodhart's law work?In the realm of machine learning, Goodhart's Law stands as one of three noteworthy law-based optimization techniques, along with Alignment Daemons, which have been instrumental in achieving significant benefits for machine learning.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " 71%|███████ | 200/283 [01:53<00:47, 1.77it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Batch 200 of 283. Loss:0.4759449064731598.\n", "Example output: In what ways does the creative process get impacted by deep learning?The realm of deep learning is dedicated to the exploration of novel means of creating engaging and immersive artworks. The ability to imbibe these outputs imbued with profound emotional and ethical significance enables individuals to imbibe these exceptional creations with greater comprehension and comprehension.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 283/283 [02:41<00:00, 1.75it/s]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Average Training Loss: 0.45279810411770016.\n", "Validation loss: 0.46003800816833973.\n", "Beginning epoch 5 of 5\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " 35%|███▌ | 100/283 [00:56<01:43, 1.76it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Batch 100 of 283. Loss:0.4711158275604248.\n", "Example output: How does a feedforward neural network work?An FNN is a type of neural network that lacks cyclic or recursive connections, unlike the recurrent neural networks of RNNs. The FNN's main purpose is to form a graph over a time sequence, then display dynamic behavior using the feedforward pass through it. This allows the network to exhibit dynamic behavior that is proportional to the number of folds in the graph. 
The Feedforward Neural Network (FNN) is an example of a FNN, famously known as Inner-Neural Zoo.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ " 71%|███████ | 200/283 [01:54<00:47, 1.77it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Batch 200 of 283. Loss:0.4309981167316437.\n", "Example output: How does Aumann's agreement theorem relate to Nash Equilibrium?According to Aumann's Agreement Theorem, two rational agents who share common principles are obligated to act in accordance with one another's beliefs, and this means they must inevitably come to a mutual agreement.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 283/283 [02:42<00:00, 1.75it/s]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Average Training Loss: 0.42189987072253815.\n", "Validation loss: 0.4528817282989621.\n" ] } ], "source": [ "import tqdm\n", "\n", "training_stats = []\n", "\n", "for epoch_i in range(0, epochs):\n", "\n", "    print(f'Beginning epoch {epoch_i + 1} of {epochs}')\n", "\n", "    total_train_loss = 0\n", "\n", "    model.train()\n", "\n", "    for step, batch in enumerate(tqdm.tqdm(train_dataloader)):\n", "\n", "        b_input_ids = batch[0].to(device)\n", "        b_labels = batch[0].to(device)\n", "        b_masks = batch[1].to(device)\n", "\n", "        model.zero_grad()\n", "\n", "        outputs = model(b_input_ids,\n", "                        labels=b_labels,\n", "                        attention_mask = b_masks,\n", "                        token_type_ids=None)\n", "\n", "        loss = outputs[0]\n", "\n", "        batch_loss = loss.item()\n", "        total_train_loss += batch_loss\n", "\n", "        # Sample every 100 batches\n", "        if step % sample_every == 0 and step != 0:\n", "\n", "            print(f'Batch {step} of {len(train_dataloader)}. Loss:{batch_loss}.')\n", "\n", "            model.eval()\n", "\n", "            inputs = tokenizer(tokenizer.bos_token + df.prompt.sample().iloc[0] + tokenizer.eos_token, return_tensors=\"pt\").to(device)\n", "\n", "            sample_outputs = model.generate(**inputs,\n", "                                            bos_token_id=tokenizer.bos_token_id,\n", "                                            pad_token_id=tokenizer.pad_token_id,\n", "                                            eos_token_id=tokenizer.eos_token_id,\n", "                                            do_sample=True,\n", "                                            top_k=50,\n", "                                            max_length = 200,\n", "                                            top_p=0.95,\n", "                                            num_return_sequences=1)\n", "\n", "            for i, sample_output in enumerate(sample_outputs):\n", "                print(f'Example output: {tokenizer.decode(sample_output, skip_special_tokens=True)}')\n", "\n", "            model.train()\n", "\n", "        loss.backward()\n", "\n", "        optimizer.step()\n", "\n", "        scheduler.step()\n", "\n", "    avg_train_loss = total_train_loss / len(train_dataloader)\n", "\n", "    print(f'Average Training Loss: {avg_train_loss}.')\n", "\n", "    model.eval()\n", "\n", "    total_eval_loss = 0\n", "    nb_eval_steps = 0\n", "\n", "    for batch in validation_dataloader:\n", "\n", "        b_input_ids = batch[0].to(device)\n", "        b_labels = batch[0].to(device)\n", "        b_masks = batch[1].to(device)\n", "\n", "        with torch.no_grad():\n", "\n", "            outputs = model(b_input_ids,\n", "                            attention_mask = b_masks,\n", "                            labels=b_labels)\n", "\n", "            loss = outputs[0]\n", "\n", "        batch_loss = loss.item()\n", "        total_eval_loss += batch_loss\n", "\n", "    avg_val_loss = total_eval_loss / len(validation_dataloader)\n", "\n", "    print(f'Validation loss: {avg_val_loss}.')\n", "\n", "    training_stats.append(\n", "        {\n", "            'epoch': epoch_i + 1,\n", "            'Training Loss': avg_train_loss,\n", "            'Valid. Loss': avg_val_loss,\n", "        }\n", "    )\n", "\n", "print(\"Training complete!\")\n", "\n", "df_stats = pd.DataFrame(data=training_stats)\n", "df_stats = df_stats.set_index('epoch')\n", "df_stats.to_parquet(\"./training_stats.parquet\", compression=\"gzip\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "8. Plotting Learning Curves." 
] }, { "cell_type": "code", "execution_count": 21, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 446 }, "id": "J1-hAY9Av4gT", "outputId": "799f3284-c8e7-46f4-e7f1-55638a6d81bc" }, "outputs": [ { "data": { "text/markdown": [ "| epoch | Training Loss | Valid. Loss |\n", "|--------:|----------------:|--------------:|\n", "| 1 | 0.852566 | 0.574185 |\n", "| 2 | 0.549692 | 0.544331 |\n", "| 3 | 0.481956 | 0.53601 |" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import pandas as pd\n", "import seaborn as sns\n", "import matplotlib.pyplot as plt\n", "from IPython.display import Markdown, display\n", "\n", "df_stats = pd.read_parquet(\"./training_stats.parquet\")\n", "\n", "display(Markdown(df_stats.to_markdown()))\n", "# Use plot styling from seaborn.\n", "sns.set(style='darkgrid')\n", "\n", "# Increase the plot size and font size.\n", "sns.set(font_scale=1.5)\n", "plt.rcParams[\"figure.figsize\"] = (12,6)\n", "\n", "# Plot the learning curve.\n", "plt.plot(df_stats['Training Loss'], 'b-o', label=\"Training\")\n", "plt.plot(df_stats['Valid. Loss'], 'g-o', label=\"Validation\")\n", "\n", "# Label the plot.\n", "plt.title(\"Training & Validation Loss\")\n", "plt.xlabel(\"Epoch\")\n", "plt.ylabel(\"Loss\")\n", "plt.legend()\n", "plt.xticks([1, 2, 3, 4, 5])\n", "\n", "plt.show()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "9. Saving the model." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "id": "53EiesJwv4gU" }, "outputs": [], "source": [ "import os\n", "\n", "output_dir = 'your_directory_here'\n", "\n", "# Save a trained model, configuration and tokenizer using `save_pretrained()`.\n", "# They can then be reloaded using `from_pretrained()`\n", "model_to_save = model.module if hasattr(model, 'module') else model\n", "model_to_save.save_pretrained(output_dir)\n", "tokenizer.save_pretrained(output_dir)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "10. Test the model." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Question: 👤 What time is now?\n", "\n" ] }, { "data": { "text/html": [ "Response 1: 🤖 Regrettably, I am unable to provide a timekeeping service for your computer." 
], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Response 2: 🤖 Regrettably, I do not possess the ability to provide you with the response you seek. As an artificial intelligence language model, my abilities are confined to that of a conversational search engine. My primary function is to assist you with your inquiries and provide useful information. If you have any further inquiries or require assistance with a different matter, I am available to assist you in your search." ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Response 3: 🤖 I assume you are currently at the current time." ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from transformers import GPT2Tokenizer, GPT2LMHeadModel\n", "from IPython.display import HTML\n", "import torch\n", "\n", "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n", "\n", "tokenizer = GPT2Tokenizer.from_pretrained('nicholasKluge/Aira-Instruct-124M')\n", "aira = GPT2LMHeadModel.from_pretrained('nicholasKluge/Aira-Instruct-124M')\n", "\n", "aira.to(device)\n", "aira.eval()\n", "\n", "question = input(\"Enter your question: \")\n", "\n", "inputs = tokenizer(tokenizer.bos_token + question + tokenizer.eos_token, return_tensors=\"pt\").to(device)\n", "\n", "responses = aira.generate(**inputs,\n", "                          bos_token_id=tokenizer.bos_token_id,\n", "                          pad_token_id=tokenizer.pad_token_id,\n", "                          eos_token_id=tokenizer.eos_token_id,\n", "                          do_sample=True,\n", "                          top_k=50,\n", "                          max_length=200,\n", "                          top_p=0.95,\n", "                          temperature=0.7,\n", "                          num_return_sequences=3)\n", "\n", "print(f\"Question: 👤 {question}\\n\")\n", "\n", "for i, response in enumerate(responses):\n", "\n", "    # print only the response and remove the question\n", "    display(HTML(f'Response {i+1}: 🤖 {tokenizer.decode(response, skip_special_tokens=True).replace(question, \"\")}'))" ] }, { "attachments": {}, 
"cell_type": "markdown", "metadata": {}, "source": [ "Done 🤗" ] } ], "metadata": { "accelerator": "GPU", "colab": { "gpuType": "A100", "machine_shape": "hm", "provenance": [] }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" }, "orig_nbformat": 4, "widgets": { "application/vnd.jupyter.widget-state+json": { "09570e24b3874bb99eed96409166e9e7": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_0b8dced3f8bc4172a46c93fccca737eb", "placeholder": "​", "style": "IPY_MODEL_e8b56ed15a48482096142f730bb248f3", "value": "Downloading (…)neration_config.json: 100%" } }, "0a323535eb0b47ac8fd9a990fdc5d089": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": 
null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "0b8dced3f8bc4172a46c93fccca737eb": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "1184f98b345e4702999cae47b8cc35bc": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_a8b98fc646244db3a0bdc30439eb2424", 
"placeholder": "​", "style": "IPY_MODEL_ff75df2d1bec41ed864bff7caa030755", "value": "Downloading (…)olve/main/merges.txt: 100%" } }, "213c5a7614e44882b9ca57a45be83e9c": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_a1a4435eef794b52ac1e7fc1b25f6c06", "placeholder": "​", "style": "IPY_MODEL_a2469a2f4c7a4691a2effec853e06c45", "value": " 1.04M/1.04M [00:00<00:00, 5.20MB/s]" } }, "32fdf73a7b36404fbe56e5972b868291": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "3a422de2c36a404db72dcf73dfdf8a49": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_be6bc5e6b88742f0967c7116fcb8e5bf", "placeholder": "​", "style": "IPY_MODEL_c4ccef750b32416fb40481bbe3154999", "value": "Downloading (…)lve/main/config.json: 100%" } }, "3f20e57a12354850a04214035a1caa0d": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": 
"FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_5806af5745314c088e1ab7d8b6d6ebf4", "max": 456318, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_f032b4ef03e24950b6c103db75fe0b3f", "value": 456318 } }, "3fa1b758af9745ba9fbb74ec28428c02": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_3a422de2c36a404db72dcf73dfdf8a49", "IPY_MODEL_bf45b6222dde45ffb49de8cca8a1bc7e", "IPY_MODEL_452924a5da7e42abafc229cd3227a903" ], "layout": "IPY_MODEL_d0ce04661f204129b35cf3edb503d8e1" } }, "40c2db61393943d6a25c31aaf0b19dad": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": 
null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "42f9ebb1536b4d73ac126a49035fd54a": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "452924a5da7e42abafc229cd3227a903": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_947715586fb843819dc016eb87aca236", 
"placeholder": "​", "style": "IPY_MODEL_bbe4b358ce6d45e2af67c243ab5c4297", "value": " 665/665 [00:00<00:00, 54.4kB/s]" } }, "4a85a12703a04dc8acb6808e520c9121": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "5280c947d92a42ff9c81f04b05f8b01d": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": 
null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "55cecbcb63304d9f80f4931a368232e0": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "567c00bdbe7f45cab7e48eca1a86ee62": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ 
"IPY_MODEL_09570e24b3874bb99eed96409166e9e7", "IPY_MODEL_f803d30823a64323886d2fd5315858ca", "IPY_MODEL_db12c8a901d644c0b9aac03b7f1d37a4" ], "layout": "IPY_MODEL_cebda9f32b0d4519b7bebfbf67ac410e" } }, "5806af5745314c088e1ab7d8b6d6ebf4": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "5be34fa7a15a441d96b20daef61c6891": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "5d489be7701f4057874e80f081f04c1d": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", 
"_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "62b2821c929a41ceb050fde699cf6759": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": 
null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "681e6b5e2a484e63aabe7f9fb2767ad4": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "6c37b4e7a676493fafae60d1af2134d5": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "73d25358055d4fa387d6986aeb2e2244": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_1184f98b345e4702999cae47b8cc35bc", "IPY_MODEL_3f20e57a12354850a04214035a1caa0d", "IPY_MODEL_89986188d8a34ba5bf7321306680e3e2" ], "layout": "IPY_MODEL_42f9ebb1536b4d73ac126a49035fd54a" } }, "7574f84c6d3c484cb1d07f417c47ba98": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": 
"StyleView", "bar_color": null, "description_width": "" } }, "772c278663014691833034b56968365a": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "7b95ba7da67a413f8e45864ca909cd6f": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "89986188d8a34ba5bf7321306680e3e2": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", 
"_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_dafc03cbc320485cac8ad104e9e182fa", "placeholder": "​", "style": "IPY_MODEL_6c37b4e7a676493fafae60d1af2134d5", "value": " 456k/456k [00:00<00:00, 3.42MB/s]" } }, "90dba2dc7fec4caaac7101a2bc399d63": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_62b2821c929a41ceb050fde699cf6759", "max": 1042301, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_f5c6544b91fc4d18bb2ca5a707d7c084", "value": 1042301 } }, "947715586fb843819dc016eb87aca236": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, 
"overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "9706bd5a6d164205989e96923280c522": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_7b95ba7da67a413f8e45864ca909cd6f", "placeholder": "​", "style": "IPY_MODEL_681e6b5e2a484e63aabe7f9fb2767ad4", "value": "Downloading pytorch_model.bin: 100%" } }, "9f34f4ad71f04b64905d9b25f2923ff9": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_b0821e0aebd94a37a4e4c7c4472d64fb", "IPY_MODEL_90dba2dc7fec4caaac7101a2bc399d63", "IPY_MODEL_213c5a7614e44882b9ca57a45be83e9c" ], "layout": "IPY_MODEL_0a323535eb0b47ac8fd9a990fdc5d089" } }, "a1a4435eef794b52ac1e7fc1b25f6c06": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, 
"grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "a2469a2f4c7a4691a2effec853e06c45": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "a7bb2d4cc1f74f33a62466e130d3c1a9": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_9706bd5a6d164205989e96923280c522", "IPY_MODEL_ef7694f2925d4422804baaede5c64c0b", "IPY_MODEL_ef8d6e8fb2d54711bac0321e577ce9a5" ], "layout": "IPY_MODEL_b629c8287f904d5c904f19d148ff663e" } }, "a8b98fc646244db3a0bdc30439eb2424": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", 
"align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "b0821e0aebd94a37a4e4c7c4472d64fb": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_5280c947d92a42ff9c81f04b05f8b01d", "placeholder": "​", "style": "IPY_MODEL_da2f398f990c4e19a0fa6945fcfe9bb0", "value": "Downloading (…)olve/main/vocab.json: 100%" } }, "b629c8287f904d5c904f19d148ff663e": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, 
"grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "bbe4b358ce6d45e2af67c243ab5c4297": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "be6bc5e6b88742f0967c7116fcb8e5bf": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, 
"overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "bf45b6222dde45ffb49de8cca8a1bc7e": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_40c2db61393943d6a25c31aaf0b19dad", "max": 665, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_772c278663014691833034b56968365a", "value": 665 } }, "c4ccef750b32416fb40481bbe3154999": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "cebda9f32b0d4519b7bebfbf67ac410e": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, 
"grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "d0ce04661f204129b35cf3edb503d8e1": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "d8c1b2a4ddc8416aa834c9379b307d4c": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": 
null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "da2f398f990c4e19a0fa6945fcfe9bb0": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "dafc03cbc320485cac8ad104e9e182fa": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": 
null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "db12c8a901d644c0b9aac03b7f1d37a4": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_4a85a12703a04dc8acb6808e520c9121", "placeholder": "​", "style": "IPY_MODEL_32fdf73a7b36404fbe56e5972b868291", "value": " 124/124 [00:00<00:00, 9.08kB/s]" } }, "e7004f2f677f4401bfbf729720ca987e": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "e8b56ed15a48482096142f730bb248f3": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "ef7694f2925d4422804baaede5c64c0b": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": 
"@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_55cecbcb63304d9f80f4931a368232e0", "max": 548118077, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_5be34fa7a15a441d96b20daef61c6891", "value": 548118077 } }, "ef8d6e8fb2d54711bac0321e577ce9a5": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_d8c1b2a4ddc8416aa834c9379b307d4c", "placeholder": "​", "style": "IPY_MODEL_e7004f2f677f4401bfbf729720ca987e", "value": " 548M/548M [00:01<00:00, 293MB/s]" } }, "f032b4ef03e24950b6c103db75fe0b3f": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "f5c6544b91fc4d18bb2ca5a707d7c084": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, 
"description_width": "" } }, "f803d30823a64323886d2fd5315858ca": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_5d489be7701f4057874e80f081f04c1d", "max": 124, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_7574f84c6d3c484cb1d07f417c47ba98", "value": 124 } }, "ff75df2d1bec41ed864bff7caa030755": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } } } } }, "nbformat": 4, "nbformat_minor": 0 }