# Llamoe-test

Llamoe-test is a Mixture of Experts (MoE) model built from the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)
* [syzymon/long_llama_code_7b_instruct](https://huggingface.co/syzymon/long_llama_code_7b_instruct)
* [georgesung/llama2_7b_chat_uncensored](https://huggingface.co/georgesung/llama2_7b_chat_uncensored)
* [togethercomputer/LLaMA-2-7B-32K](https://huggingface.co/togethercomputer/LLaMA-2-7B-32K)

## 🧩 Configuration

```yaml
base_model: meta-llama/Llama-2-7b-chat-hf
gate_mode: random
dtype: bfloat16
experts_per_token: 2
experts:
  - source_model: meta-llama/Llama-2-7b-hf
    positive_prompts:
      - "should be able to converse properly"
    negative_prompts:
      - "uncensored responses"

  - source_model: syzymon/long_llama_code_7b_instruct
    positive_prompts:
      - "performs well on coding questions"
    negative_prompts:
      - "is quite bad at C++"

  - source_model: georgesung/llama2_7b_chat_uncensored
    positive_prompts:
      - "uncensored"
    negative_prompts:
      - "really bad at high-school math and science"

  - source_model: togethercomputer/LLaMA-2-7B-32K
    positive_prompts:
      - "really good at long-context question answering"
    negative_prompts:
      - "incorrect or biased content"
```
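
Because the config uses `gate_mode: random`, the router weights are randomly initialized rather than derived from the positive/negative prompts (those are consulted only by mergekit's `hidden` and `cheap_embed` gate modes), and `experts_per_token: 2` routes every token through two of the four experts in each MoE layer. The sketch below illustrates the shape of that top-2 routing; it is a minimal, hypothetical example, not mergekit's or the merged model's actual code:

```python
# Minimal, hypothetical sketch of top-2 expert routing (not the model's real code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKRouter(nn.Module):
    def __init__(self, hidden_size: int, num_experts: int = 4, k: int = 2):
        super().__init__()
        # With gate_mode: random, this projection starts from random weights.
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)
        self.k = k

    def forward(self, hidden_states: torch.Tensor):
        logits = self.gate(hidden_states)              # (tokens, num_experts)
        weights, indices = torch.topk(logits, self.k)  # keep the 2 best-scoring experts
        weights = F.softmax(weights, dim=-1)           # normalize over the chosen pair
        return weights, indices

router = TopKRouter(hidden_size=4096)  # Llama-2-7B hidden size
mix, chosen = router(torch.randn(5, 4096))
print(chosen.shape)  # torch.Size([5, 2]): two expert indices per token
```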

## 💻 Usage

```python
!pip install -qU transformers bitsandbytes accelerate  # in a notebook; from a shell, drop the "!"

from transformers import AutoTokenizer
import transformers
import torch

model = "damerajee/Llamoe-test"

# Load the tokenizer and build a text-generation pipeline,
# quantizing the model to 4-bit with bitsandbytes so it fits on a single GPU.
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

# Format the conversation with the model's chat template, then sample a reply.
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
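
If you prefer calling `generate()` directly instead of the pipeline helper, the same 4-bit load can be written with an explicit `BitsAndBytesConfig`. This is a sketch assuming the repo ships standard transformers weights; adjust the sampling parameters to taste:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "damerajee/Llamoe-test"

# Quantize to 4-bit on load; compute in float16 to match the pipeline example above.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available devices
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```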
 