This model is created by finetuning EleutherAI/pythia-1.4b-deduped
on the Dahoas/synthetic-instruct-gptj-pairwise
.
You can try a demo of the model hosted on Lambda Cloud.
Model Details
- Finetuned by: Lambda
- Model type: Transformer-based Language Model
- Language: English
- Pre-trained model: EleutherAI/pythia-1.4b-deduped
- Dataset: Dahoas/synthetic-instruct-gptj-pairwise
- Library: transformers
- License: Apache 2.0
Prerequisites
Running inference with the model takes ~4GB of GPU memory.
Quick Start
import torch
from transformers import AutoTokenizer, pipeline, StoppingCriteria, StoppingCriteriaList
device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
model_name = "lambdalabs/pythia-1.4b-deduped-synthetic-instruct"
max_new_tokens = 2048
stop_token = "<|stop|>"
class KeywordsStoppingCriteria(StoppingCriteria):
def __init__(self, keywords_ids: list):
self.keywords = keywords_ids
def __call__(
self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs
) -> bool:
if input_ids[0][-1] in self.keywords:
return True
return False
tokenizer = AutoTokenizer.from_pretrained(
model_name,
)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.add_tokens([stop_token])
stop_ids = [tokenizer.encode(w)[0] for w in [stop_token]]
stop_criteria = KeywordsStoppingCriteria(stop_ids)
generator = pipeline(
"text-generation",
model=model_name,
device=device,
max_new_tokens=max_new_tokens,
torch_dtype=torch.float16,
stopping_criteria=StoppingCriteriaList([stop_criteria]),
)
example = "Can you give me some tips on how to save money every month."
text = "Question: {}\nAnswer:".format(example)
result = generator(
text,
num_return_sequences=1,
)
output = result[0]["generated_text"]
print(output)
Output:
Question: Can you give me some tips on how to save money every month.
Answer:Create a budget and track your spending.
2. Cut down on unnecessary expenses, such as eating out, shopping, and entertainment.
3. Make a list of your monthly expenses and stick to it.
4. Take advantage of discounts and coupons when shopping.
5. Make sure to pay your bills on time to avoid late fees.
6. Save a portion of your income each month by investing it in a high-yield savings account.
7. Consider automating your savings by setting up a recurring transfer from your checking to a savings account.
8. Take advantage of free entertainment opportunities, such as going to the park or museum.
9. Look for ways to save on utilities, such as installing energy-efficient appliances.
10. Research and use public transportation to save on gas.<|stop|>
Training
The model was trained on the Dahoas/synthetic-instruct-gptj-pairwise
. We split the original dataset into the train (first 32000 examples) and validation (the remaining 1144 examples) subsets.
We finetune the model for 4 epoches. This took 8xA100 80GB 2 hours, where we set batch_size_per_gpu
to 8
(so global batch size is 64), and learning rate to 0.00002
(with linear decay to zero at the last trainig step). You can find a Weights and Biases record here.
- Downloads last month
- 501
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.