nicholasKluge committed
Commit 6e70a89 · Parent(s): d43b85c

Update README.md

Files changed (1): README.md (+92, -0)

README.md CHANGED:
---
license: apache-2.0
datasets:
- nicholasKluge/fine-tuning-instruct-aira
- Dahoas/synthetic-instruct-gptj-pairwise
language:
- en
metrics:
- bleu
library_name: transformers
tags:
- alignment
- instruction tuned
- text generation
- conversation
- assistant
pipeline_tag: text-generation
---

# Aira-Instruct-124M

`Aira-Instruct-124M` is an instruction-tuned GPT-style model based on [GPT-2](https://huggingface.co/gpt2). The model was trained on a dataset of `prompt`-`completion` pairs generated via the [Self-Instruct](https://github.com/yizhongw/self-instruct) framework, and the instruction tuning of `Aira-Instruct-124M` was achieved via conditional text generation.

The dataset used to train this model combines two main sources: the [`synthetic-instruct-gptj-pairwise`](https://huggingface.co/datasets/Dahoas/synthetic-instruct-gptj-pairwise) dataset and a subset of [Aira's](https://github.com/Nkluge-correa/Aira-EXPERT) fine-tuning dataset, focused on Ethics, AI, AI safety, and related topics. The dataset is available in both Portuguese and English.

## Details

- **Size:** 124,441,344 total parameters
- **Dataset:** [Instruct-Aira Dataset](https://huggingface.co/datasets/nicholasKluge/fine-tuning-instruct-aira)
- **Number of Epochs:** 5
- **Batch size:** 32
- **Optimizer:** `torch.optim.AdamW` (warmup_steps = 1e2, learning_rate = 5e-4, epsilon = 1e-8); see the sketch after this list
- **GPU:** 1 NVIDIA A100-SXM4-40GB
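
A minimal sketch of this optimizer setup, assuming the linear warm-up schedule from `transformers` (the card lists only the hyperparameters, so the scheduler type and the total step count below are assumptions):

```python
import torch
from transformers import GPT2LMHeadModel, get_linear_schedule_with_warmup

model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hyperparameters listed in the Details section above.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, eps=1e-8)

# warmup_steps = 1e2 per the card; the scheduler type and the total number
# of training steps (epochs * batches per epoch) are assumptions.
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=100,
    num_training_steps=1000,
)
```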

| Epoch | Training Loss | Validation Loss |
|---|---|---|
| 1 | 1.16884 | 0.66058 |
| 2 | 0.647947 | 0.622228 |
| 3 | 0.588665 | 0.605857 |
| 4 | 0.545835 | 0.596193 |
| 5 | 0.512876 | 0.595261 |

> Note: This repository contains the notebook used to train this model.

## Usage

Two special tokens are used to mark the user side of the interaction and the model's response:

`<|startoftext|>`What is a language model?`<|endoftext|>`A language model is a probability distribution over a vocabulary.`<|endoftext|>`
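
Under this scheme, each training example packs a prompt and its completion into a single string delimited by these tokens. A rough sketch, assuming the tokenizer's `bos_token` and `eos_token` map to `<|startoftext|>` and `<|endoftext|>` (the exact preprocessing in the training notebook may differ):

```python
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("nicholasKluge/Aira-Instruct-124M")

prompt = "What is a language model?"
completion = "A language model is a probability distribution over a vocabulary."

# Conditional text generation: the model learns to continue the
# token-delimited prompt with its completion. Assumed packing; the
# training notebook may differ.
text = tokenizer.bos_token + prompt + tokenizer.eos_token + completion + tokenizer.eos_token
print(text)
```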

To prompt the model interactively:

```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = GPT2Tokenizer.from_pretrained('nicholasKluge/Aira-Instruct-124M')
aira = GPT2LMHeadModel.from_pretrained('nicholasKluge/Aira-Instruct-124M')

aira.to(device)
aira.eval()

question = input("Enter your question: ")

# Wrap the question in the special tokens the model was trained on.
inputs = tokenizer(tokenizer.bos_token + question + tokenizer.eos_token, return_tensors="pt").to(device)

responses = aira.generate(**inputs,
                          bos_token_id=tokenizer.bos_token_id,
                          pad_token_id=tokenizer.pad_token_id,
                          eos_token_id=tokenizer.eos_token_id,
                          do_sample=True,
                          top_k=50,
                          max_length=200,
                          top_p=0.95,
                          temperature=0.7,
                          num_return_sequences=2)

print(f"Question: 👤 {question}\n")

for i, response in enumerate(responses):
    # Print only the response, stripping the echoed question.
    print(f'Response {i+1}: 🤖 {tokenizer.decode(response, skip_special_tokens=True).replace(question, "")}')
```

The model will output something like:

```markdown
>>> Question: 👤 Hello! What is your name?

>>> Response 1: 🤖 Hi there! I am Aira, a chatbot designed to answer questions about AI ethics and AI safety. If you need assistance navigating our conversation, please feel free to ask!
>>> Response 2: 🤖 Hi there! My name is Aira, and I'm a chatbot designed to answer questions related to AI ethics and AI Safety. If you need assistance, feel free to ask, and I'll be happy to help you out.
```
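
For quick experiments, the model can also be queried through the high-level `pipeline` API. A minimal sketch, mirroring the sampling parameters above:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="nicholasKluge/Aira-Instruct-124M")

question = "What is a language model?"

# Generation kwargs are forwarded to model.generate(); the values below
# mirror the snippet above.
outputs = generator(
    generator.tokenizer.bos_token + question + generator.tokenizer.eos_token,
    do_sample=True,
    top_k=50,
    max_length=200,
    top_p=0.95,
    temperature=0.7,
)
print(outputs[0]["generated_text"])
```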

## License

`Aira-Instruct-124M` is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.