Commit 6e70a89 (parent d43b85c): Update README.md

---
license: apache-2.0
datasets:
- nicholasKluge/fine-tuning-instruct-aira
- Dahoas/synthetic-instruct-gptj-pairwise
language:
- en
metrics:
- bleu
library_name: transformers
tags:
- alignment
- instruction tuned
- text generation
- conversation
- assistant
pipeline_tag: text-generation
---

# Aira-Instruct-124M

`Aira-Instruct-124M` is an instruction-tuned GPT-style model based on [GPT-2](https://huggingface.co/gpt2). The model was trained on a dataset of `prompt` and `completion` pairs generated via the [Self-Instruct](https://github.com/yizhongw/self-instruct) framework. Instruction tuning of `Aira-Instruct-124M` was achieved via conditional text generation.
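
Concretely, conditional text generation here means that each `prompt` and `completion` pair is serialized into a single sequence delimited by special tokens, on which the model is then trained, presumably with the standard causal language-modeling objective. Below is a minimal sketch of that serialization; the exact preprocessing lives in the training notebook, and the token strings used here are the demo format shown under Usage:

```python
# Illustrative sketch only: the real preprocessing is in the training
# notebook. The token strings match the demo format shown under Usage.
prompt = "What is a language model?"
completion = "A language model is a probability distribution over a vocabulary."

# One training example: the prompt and its completion in a single sequence,
# with <|endoftext|> closing each turn.
example = "<|startoftext|>" + prompt + "<|endoftext|>" + completion + "<|endoftext|>"
```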

The dataset used to train this model combines two main sources: the [`synthetic-instruct-gptj-pairwise`](https://huggingface.co/datasets/Dahoas/synthetic-instruct-gptj-pairwise) dataset and a subset of [Aira's](https://github.com/Nkluge-correa/Aira-EXPERT) fine-tuning dataset focused on Ethics, AI, AI safety, and related topics. The dataset is available in both Portuguese and English.

## Details

- **Size:** 124,441,344 total parameters
- **Dataset:** [Instruct-Aira Dataset](https://huggingface.co/datasets/nicholasKluge/fine-tuning-instruct-aira)
- **Number of epochs:** 5
- **Batch size:** 32
- **Optimizer:** `torch.optim.AdamW` (warmup_steps = 1e2, learning_rate = 5e-4, epsilon = 1e-8)
- **GPU:** 1 NVIDIA A100-SXM4-40GB

| Epoch | Training Loss | Validation Loss |
|---|---|---|
| 1 | 1.16884 | 0.66058 |
| 2 | 0.647947 | 0.622228 |
| 3 | 0.588665 | 0.605857 |
| 4 | 0.545835 | 0.596193 |
| 5 | 0.512876 | 0.595261 |

> Note: This repository contains the notebook used to train this model.
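
As a rough illustration, the optimizer settings listed under Details could be wired up as follows. This is a sketch, not the notebook's code: the linear warmup schedule (`get_linear_schedule_with_warmup`) and the `total_steps` value are assumptions.

```python
import torch
from transformers import GPT2LMHeadModel, get_linear_schedule_with_warmup

model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hyperparameters from the Details section above.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, eps=1e-8)

# Assumed: linear warmup over the first 100 steps (warmup_steps = 1e2).
# total_steps is hypothetical; in the real run it would be
# (batches per epoch) * 5 epochs.
total_steps = 10_000
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=100, num_training_steps=total_steps
)
```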

## Usage

Two special tokens are used to mark the user side of the interaction and the model's response:

`<|startoftext|>` What is a language model?`<|endoftext|>`A language model is a probability distribution over a vocabulary. `<|endoftext|>`

```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = GPT2Tokenizer.from_pretrained('nicholasKluge/Aira-Instruct-124M')
aira = GPT2LMHeadModel.from_pretrained('nicholasKluge/Aira-Instruct-124M')

aira.to(device)
aira.eval()

question = input("Enter your question: ")

# Wrap the question in the special tokens the model was trained on
inputs = tokenizer(tokenizer.bos_token + question + tokenizer.eos_token, return_tensors="pt").to(device)

responses = aira.generate(
    **inputs,
    bos_token_id=tokenizer.bos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
    do_sample=True,
    top_k=50,
    max_length=200,
    top_p=0.95,
    temperature=0.7,
    num_return_sequences=2,
)

print(f"Question: 👤 {question}\n")

for i, response in enumerate(responses):
    # Print only the response, stripping the echoed question
    print(f'Response {i+1}: 🤖 {tokenizer.decode(response, skip_special_tokens=True).replace(question, "")}')
```

The model will output something like:

```markdown
>>> Question: 👤 Hello! What is your name?

>>> Response 1: 🤖 Hi there! I am Aira, a chatbot designed to answer questions about AI ethics and AI safety. If you need assistance navigating our conversation, please feel free to ask!
>>> Response 2: 🤖 Hi there! My name is Aira, and I'm a chatbot designed to answer questions related to AI ethics and AI Safety. If you need assistance, feel free to ask, and I'll be happy to help you out.
```
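
Since `do_sample=True`, the responses differ from run to run. For reproducible outputs you could fix the random seed before generating (a small addition, not part of the original example):

```python
from transformers import set_seed

# Fixing the seed makes the sampled responses reproducible across runs.
set_seed(42)
```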

## License

`Aira-Instruct-124M` is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.