---
language:
- en
library_name: transformers
license: llama3.1
pipeline_tag: text-generation
---
<!-- Provide a quick summary of what the model is/does. -->
A Llama 3.1 Instruct model finetuned with knowledge distillation
for expertise in AMD technologies and Python coding.
### Model Description
<!-- Provide a longer summary of what this model is. -->
This is the model card of a 🤗 transformers model that has been
pushed to the Hub.
- **Developed by:** David Silverstein
- **Language(s) (NLP):** English, Python
- **License:** Free to use under Llama 3.1 licensing terms without warranty
- **Finetuned from model:** meta-llama/Meta-Llama-3.1-8B-Instruct
### Model Sources [optional]
<!-- Provide the basic links for the model. -->
- **Repository:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
The model can serve as a development assistant for work with AMD technologies and Python
in on-premise environments.
## Bias, Risks, and Limitations
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
[More Information Needed]
### Recommendations
<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
Users (both direct and downstream) should be made aware of the risks, biases and
limitations of the model. More information needed for further recommendations.
## How to Get Started with the Model
Use the code below to get started with the model:
~~~
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
model_name = 'davidsi/Llama3_1-8B-Instruct-AMD-python'
tokenizer = AutoTokenizer.from_pretrained(model_name)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16).to(device)
# Example question; replace `query` with your own prompt.
query = "How do I check that PyTorch can see my AMD GPU under ROCm?"

messages = [
    {"role": "system", "content": "You are a helpful assistant for AMD technologies and python."},
    {"role": "user", "content": query}
]
# Stop generation at either the model's EOS token or Llama 3.1's end-of-turn token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]
input_ids = tokenizer.apply_chat_template(
messages,
add_generation_prompt=True,
return_tensors="pt"
).to(device)
outputs = model.generate(
input_ids,
max_new_tokens=8192,
eos_token_id=terminators,
pad_token_id=tokenizer.eos_token_id,
do_sample=True,
temperature=0.6,
top_p=0.9,
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
~~~
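For quick experiments, the same chat format also works through the high-level `pipeline` API. The snippet below is a minimal sketch, assuming a recent `transformers` release whose text-generation pipeline accepts chat-style message lists; the example question is illustrative.
~~~
import torch
from transformers import pipeline

# Minimal sketch using the text-generation pipeline; generation settings mirror the example above.
pipe = pipeline(
    "text-generation",
    model="davidsi/Llama3_1-8B-Instruct-AMD-python",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant for AMD technologies and python."},
    {"role": "user", "content": "How do I list the GPUs visible to PyTorch?"},  # illustrative question
]

outputs = pipe(messages, max_new_tokens=512, do_sample=True, temperature=0.6, top_p=0.9)
print(outputs[0]["generated_text"][-1]["content"])
~~~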
## Training Details
Torchtune was used for full finetuning, running 5 epochs on a single AMD Instinct MI210 GPU.
The training set consisted of 1,658 question/answer pairs in Alpaca format.
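For reference, an Alpaca-format record pairs an instruction with an optional input and a target output. The record below is a purely illustrative placeholder showing the expected structure, not an entry from the actual training set:
~~~
# Illustrative Alpaca-format record (placeholder content, not from the actual dataset).
example_pair = {
    "instruction": "A question about AMD technologies or Python, posed to the model.",
    "input": "",  # optional extra context; empty for plain question/answer pairs
    "output": "The reference answer the model is trained to reproduce.",
}
~~~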
### Training Data
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
[More Information Needed]
#### Training Hyperparameters
- **Training regime:** [bf16 non-mixed precision] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
### Testing Data, Factors & Metrics
#### Testing Data
<!-- This should link to a Dataset Card if possible. -->
### Model Architecture and Objective
This model is a finetuned version of Llama 3.1, which is an auto-regressive language
model that uses an optimized transformer architecture.
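The decoder-only configuration can be inspected without downloading the weights; the sketch below assumes the repository name used in the usage example above.
~~~
from transformers import AutoConfig

# Load only the configuration to inspect the underlying Llama architecture.
config = AutoConfig.from_pretrained("davidsi/Llama3_1-8B-Instruct-AMD-python")
print(config.model_type)           # "llama"
print(config.num_hidden_layers)    # number of decoder layers
print(config.hidden_size)          # hidden (embedding) dimension
print(config.num_attention_heads)  # attention heads per layer
~~~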