Model Card for Finetuned OPT-350M Chatbot Model
Model Details
Model Description
This is a chat fine-tuned version of facebook/opt-350m, designed to provide chatbot-like responses using instruction fine-tuning techniques.
The goal of this tuning was to convert a base model into a chat model using instruction fine-tuning.
- Developed by: Sartaj
- Finetuned from model: facebook/opt-350m
- Language(s): English
- License: apache-2.0
- Framework: Hugging Face Transformers
Model Sources
- Repository: facebook/opt-350m
- Paper: paper
Uses
The model can be used to generate basic code, and it can be further fine-tuned to refine code generation.
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_name = "sartajbhuvaji/facebook-opt-350m-chat"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
def generate_response(question):
    # Build the prompt in the same template that was used for fine-tuning
    input_prompt = f"### Question: {question}\n ### Answer:"
    inputs = tokenizer(input_prompt, return_tensors="pt").to(device)
    # Generate output using the model (beam search, no sampling)
    outputs = model.generate(
        **inputs,  # passes input_ids together with the attention_mask
        max_length=500,
        num_beams=5,
        # temperature is omitted: it has no effect unless do_sample=True
        eos_token_id=tokenizer.eos_token_id,
        early_stopping=True,
    )
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return generated_text
question = "Write a Python program to add two numbers."
response = generate_response(question)
print(response)
'''
### Question: Write a Python program to add two numbers.
### Answer: def add_two_numbers(a, b):
    return a + b
'''
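Because `generate_response` decodes the full sequence, the returned string contains the prompt as well as the completion, as the sample output shows. The following is a minimal sketch of how the answer could be separated from the prompt. The helper names `build_prompt` and `extract_answer` are hypothetical and are not part of the model's repository.

```python
QUESTION_TAG = "### Question:"
ANSWER_TAG = "### Answer:"

def build_prompt(question: str) -> str:
    """Format a question with the template used in the usage example."""
    return f"{QUESTION_TAG} {question}\n {ANSWER_TAG}"

def extract_answer(generated_text: str) -> str:
    """Return the text after the first answer tag, or the whole text if the tag is absent."""
    _, sep, answer = generated_text.partition(ANSWER_TAG)
    return answer.strip() if sep else generated_text.strip()

# A decoded string shaped like the sample output above
sample = (
    "### Question: Write a Python program to add two numbers.\n"
    " ### Answer: def add_two_numbers(a, b):\n"
    "    return a + b"
)
print(extract_answer(sample))
```

Splitting on the first answer tag keeps the helper independent of the question text, so it still works if the tokenizer normalizes whitespace in the prompt.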
Downstream Use
- Code Generation
- Fine Tuning
Training Details
Training Data
- Dataset: lucasmccabe-lmi/CodeAlpaca-20k
- Total Training Tokens: 2,202,939
Training Procedure
- Full Model Finetune
- Epochs: 3
Preprocessing
- Data was preprocessed to follow the template: ### Question: {question}\n ### Answer: {answer} {tokenizer.eos_token}
def formatting_prompts_func(example):
    # `example` is a batch: a dict mapping column names to lists of values
    output_texts = []
    for i in range(len(example['instruction'])):
        text = f"### Question: {example['instruction'][i]}\n ### Answer: {example['output'][i]} {tokenizer.eos_token}"
        output_texts.append(text)
    return output_texts
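To see what the template produces, the sketch below applies the same formatting to a tiny hand-written batch. It passes the end-of-sequence token as an explicit parameter (an assumption made for illustration; the original function reads it from a global `tokenizer`) and uses `</s>` as a stand-in value. The function name `format_prompts` and the sample instruction and output are invented for this example.

```python
def format_prompts(batch, eos_token):
    """Turn a batch of instruction/output pairs into training strings."""
    texts = []
    for instruction, output in zip(batch["instruction"], batch["output"]):
        texts.append(f"### Question: {instruction}\n ### Answer: {output} {eos_token}")
    return texts

# Hypothetical one-row batch in the same column layout as the dataset
batch = {
    "instruction": ["Reverse a string in Python."],
    "output": ["s[::-1]"],
}
formatted = format_prompts(batch, eos_token="</s>")
print(formatted[0])
```

Ending every training example with the end-of-sequence token is presumably what lets `eos_token_id` stop generation once the answer is complete at inference time.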
Training Loss
Trainer
- global_step: 7509
- training_loss: 0.9127310856885068
- train_runtime: 2485.7984
- train_samples_per_second: 24.164
- train_steps_per_second: 3.021
- total_flos: 2.939309944327373e+16
- train_loss: 0.9127310856885068
- epoch: 3.0
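The Trainer statistics above can be cross-checked with a little arithmetic. The sketch below derives the steps per epoch, the approximate effective batch size, and the runtime in minutes from the reported values only. The batch size is inferred from the ratio of the two throughput figures and is not stated in the card.

```python
global_step = 7509
epochs = 3
train_runtime_s = 2485.7984
samples_per_second = 24.164
steps_per_second = 3.021

steps_per_epoch = global_step / epochs
# Samples processed per optimizer step, inferred from the two throughput figures
effective_batch_size = round(samples_per_second / steps_per_second)
# Upper bound on examples per epoch (the last batch may be partial)
examples_per_epoch = steps_per_epoch * effective_batch_size
runtime_minutes = train_runtime_s / 60

print(steps_per_epoch)            # 2503.0
print(effective_batch_size)       # 8
print(examples_per_epoch)         # 20024.0
print(round(runtime_minutes, 1))  # 41.4
```

An upper bound of roughly 20,000 examples per epoch is consistent with the size suggested by the dataset name, CodeAlpaca-20k.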
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: NVIDIA A100 40 GB
- Hours used: ~10
- Cloud Provider: jetstream2
- Compute Region: USA
- Carbon Emitted: 2.24 kg
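Estimates of this kind typically multiply the energy drawn by the hardware by the carbon intensity of the local grid. The sketch below shows that arithmetic. The power draw and grid intensity are assumed values chosen for illustration, not figures reported by the author, and the function name `estimate_emissions_kg` is hypothetical.

```python
def estimate_emissions_kg(power_watts, hours, intensity_kg_per_kwh):
    """Estimate emissions as energy used (kWh) times grid carbon intensity."""
    energy_kwh = power_watts / 1000 * hours
    return energy_kwh * intensity_kg_per_kwh

# Assumed inputs: a 400 W GPU power draw and 0.56 kg CO2eq per kWh.
# Only the 10 hours of use comes from the card.
estimate = estimate_emissions_kg(power_watts=400, hours=10, intensity_kg_per_kwh=0.56)
print(round(estimate, 2))
```

These particular inputs were picked so that the result lands near the reported figure. The inputs the author actually used are not stated.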