beratcmn committed
Commit feca89e
1 Parent(s): e6b8e7b

Update README.md

Files changed (1):
  1. README.md +53 -2

README.md CHANGED
@@ -3,14 +3,65 @@ language:
  - en
  license: apache-2.0
  tags:
+ - merge
+ - mergekit
+ - lazymergekit
+ - meta-llama/Meta-Llama-3-8B
+ - beratcmn/Llama-3-11.5B
  - text-generation-inference
  - transformers
  - unsloth
  - llama
  - trl
- - sft
  base_model: beratcmn/Llama-3-11.5B
  ---
+ # Llama-3-11.5B
+
+ This model is a proof of concept. First, two Llama-3-8B models were merged using `Mergekit`, then pre-training was continued with `QLoRA` and `Unsloth` on 1,000 samples from `roneneldan/TinyStories`.
+ The loss still decreases with each epoch, so I believe this is a successful experiment with plenty of room left to explore.
+
+ Llama-3-11.5B is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
+ * [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)
+ * [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)
+
+ ## 🧩 Configuration
+
+ ```yaml
+ slices:
+   - sources:
+       - model: meta-llama/Meta-Llama-3-8B
+         layer_range: [0, 24]
+   - sources:
+       - model: meta-llama/Meta-Llama-3-8B
+         layer_range: [8, 32]
+ merge_method: passthrough
+ dtype: bfloat16
+ ```
+
+ ## 💻 Usage
+
+ ```python
+ !pip install -qU transformers accelerate
+
+ from transformers import AutoTokenizer
+ import transformers
+ import torch
+
+ model = "beratcmn/Llama-3-11.5B"
+ messages = [{"role": "user", "content": "What is a large language model?"}]
+
+ tokenizer = AutoTokenizer.from_pretrained(model)
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model,
+     torch_dtype=torch.float16,
+     device_map="auto",
+ )
+
+ outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
+ print(outputs[0]["generated_text"])
+ ```
 
  # Uploaded model
 
@@ -20,4 +71,4 @@ base_model: beratcmn/Llama-3-11.5B
 
  This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
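For context on the configuration in the updated card: LazyMergekit is a thin Colab wrapper around mergekit, so the same passthrough merge can be reproduced locally with mergekit's `mergekit-yaml` entry point. The sketch below is illustrative only; the config file name, output directory, and flags are assumptions, not the exact commands behind this commit.

```python
# Illustrative reproduction of the passthrough merge above (assumed paths and flags).
!pip install -qU mergekit

# Save the YAML configuration from the card as config.yaml, then run mergekit:
!mergekit-yaml config.yaml ./Llama-3-11.5B --copy-tokenizer --lazy-unpickle
```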
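On the model's size (my own back-of-the-envelope arithmetic, not a figure from the card): the two slices keep layers 0-23 and 8-31, so the merged stack has 24 + 24 = 48 decoder layers versus Llama-3-8B's 32. Non-embedding parameters scale roughly with depth, so about 7B × 48/32 ≈ 10.5B, plus roughly 1B of embedding and output-head weights, lands near the 11.5B in the name. The layer count can be checked from the config alone:

```python
# Check the merged model's depth without downloading the weights.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("beratcmn/Llama-3-11.5B")
print(config.num_hidden_layers)  # expected to be 48 for the slices shown above
```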
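The card also says pre-training was continued with QLoRA and Unsloth on 1,000 TinyStories samples, but the training script is not part of this commit. The sketch below is a minimal illustration of what that step might look like with Unsloth's `FastLanguageModel` and TRL's `SFTTrainer`; the LoRA rank, sequence length, batch size, learning rate, and epoch count are assumptions, not the author's settings.

```python
# Minimal, illustrative QLoRA continued pre-training sketch (hyperparameters assumed).
!pip install -qU unsloth datasets trl

from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the merged model in 4-bit so the LoRA adapters train QLoRA-style.
model, tokenizer = FastLanguageModel.from_pretrained(
    "beratcmn/Llama-3-11.5B",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters to the attention and MLP projections.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# 1,000 samples from TinyStories, as mentioned in the card.
dataset = load_dataset("roneneldan/TinyStories", split="train[:1000]")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # each TinyStories row stores the story under "text"
    max_seq_length=2048,
    args=TrainingArguments(
        output_dir="outputs",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        logging_steps=10,
    ),
)
trainer.train()
```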