A finetuned Llama 3.1 Instruct model, trained with knowledge distillation specifically for expertise in AMD technologies and Python coding.
Model Description
This is the model card of a 🤗 transformers model that has been
pushed to the Hub.
- Developed by: David Silverstein
- Language(s) (NLP): English, Python
- License: Free to use under the Llama 3.1 license terms, without warranty
- Finetuned from model: meta-llama/Meta-Llama-3.1-8B-Instruct
Model Sources
- Repository: [More Information Needed]
- Demo: [More Information Needed]
Uses
The model can be used as a development assistant for work with AMD technologies and Python, including in on-premise environments.
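As a quick illustration, the sketch below queries the model through the 🤗 transformers text-generation pipeline (this assumes a recent transformers release with chat support in the pipeline; the example question and sampling settings are illustrative placeholders, not recommended values):

# Sketch: asking an AMD/Python question via the transformers pipeline.
# The prompt and sampling settings below are illustrative placeholders.
import torch
from transformers import pipeline
pipe = pipeline(
    "text-generation",
    model="davidsi/Llama3_1-8B-Instruct-AMD-python",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
messages = [
    {"role": "system", "content": "You are a helpful assistant for AMD technologies and python."},
    {"role": "user", "content": "How do I check which GPU PyTorch is using on a ROCm system?"},
]
result = pipe(messages, max_new_tokens=256, do_sample=True, temperature=0.6, top_p=0.9)
print(result[0]["generated_text"][-1]["content"])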
Bias, Risks, and Limitations
[More Information Needed]
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and
limitations of the model. More information is needed for further recommendations.
How to Get Started with the Model
Use the code below to get started with the model; the example query is a placeholder to replace with your own prompt:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = 'davidsi/Llama3_1-8B-Instruct-AMD-python'
tokenizer = AutoTokenizer.from_pretrained(model_name)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16).to(device)

# Example question; replace with your own prompt.
query = "How do I enable mixed-precision training in PyTorch on an AMD GPU?"

messages = [
    {"role": "system", "content": "You are a helpful assistant for AMD technologies and python."},
    {"role": "user", "content": query}
]

# Stop generation at either the end-of-sequence token or the Llama 3.1 end-of-turn token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

# Build the chat prompt and move it to the target device.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(device)

outputs = model.generate(
    input_ids,
    max_new_tokens=8192,
    eos_token_id=terminators,
    pad_token_id=tokenizer.eos_token_id,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Decode only the newly generated tokens, excluding the prompt.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
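The snippet above handles a single turn. A minimal sketch of extending it to multi-turn chat, reusing the model, tokenizer, device, and terminators defined above (the prompts here are illustrative placeholders):

# Sketch of a multi-turn loop built on the objects created above.
messages = [
    {"role": "system", "content": "You are a helpful assistant for AMD technologies and python."},
]
for query in ["What is ROCm?", "Show me how to query GPU memory from Python."]:
    messages.append({"role": "user", "content": query})
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(device)
    outputs = model.generate(
        input_ids,
        max_new_tokens=1024,
        eos_token_id=terminators,
        pad_token_id=tokenizer.eos_token_id,
        do_sample=True,
        temperature=0.6,
        top_p=0.9,
    )
    reply = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
    # Feed the assistant's reply back in so the next turn has full context.
    messages.append({"role": "assistant", "content": reply})
    print(reply)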
Training Details
Torchtune was used for full finetuning, run for 5 epochs on a single AMD Instinct MI210 GPU.
The training set consisted of 1658 question/answer pairs in Alpaca format.
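For reference, an Alpaca-format record pairs an instruction (and optional input) with the expected output. A hypothetical example of the record structure, not taken from the actual training set:

# Hypothetical illustration of the Alpaca record structure; not from the actual training set.
example_record = {
    "instruction": "Explain how to select a specific GPU when running PyTorch on ROCm.",
    "input": "",
    "output": "Set the HIP_VISIBLE_DEVICES environment variable before launching Python...",
}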
Training Data
[More Information Needed]
Training Hyperparameters
- Training regime: bf16 non-mixed precision
Evaluation
[More Information Needed]
Model Architecture and Objective
This model is a finetuned version of Llama 3.1, which is an auto-regressive language
model that uses an optimized transformer architecture.
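Concretely, the objective of an auto-regressive language model is next-token prediction. A minimal sketch of computing that causal-LM loss with this model, reusing the model, tokenizer, and device from the quick-start code above (the sentence is an illustrative placeholder):

# Sketch: causal-LM (next-token prediction) loss, the objective Llama 3.1 is trained with.
text = "AMD Instinct accelerators are programmed with the ROCm software stack."
inputs = tokenizer(text, return_tensors="pt").to(device)
with torch.no_grad():
    out = model(**inputs, labels=inputs["input_ids"])  # labels trigger the shifted cross-entropy loss
print(f"Per-token cross-entropy: {out.loss.item():.3f}")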