olafgeibig committed on
Commit
9a874b2
1 Parent(s): e3e1e05

Create README.md

Files changed (1)
  1. README.md +95 -0
README.md ADDED
@@ -0,0 +1,95 @@
---
license: mit
base_model: microsoft/phi-2
tags:
- generated_from_trainer
datasets:
- teknium/OpenHermes-2.5
model-index:
- name: phi-2-OpenHermes-2.5
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# phi-2-OpenHermes-2.5
This is a GGUF quantization of [minghaowu/phi-2-OpenHermes-2.5](https://huggingface.co/minghaowu/phi-2-OpenHermes-2.5), provided in my favorite quantization types. The base model is phi-2 fine-tuned on [OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5).

I quickly quantized this model using a modified version of [AutoGGUF](https://t.co/oUuxN2fvSX) from [Maxime Labonne](https://huggingface.co/mlabonne).

The prompt format involves a bit of guesswork, but it seems to work. Here is my Ollama modelfile:
```
FROM ./phi-2-openhermes-2.5.Q5_K_M.gguf
PARAMETER num_ctx 2048
TEMPLATE """{{ .System }}
### USER: {{ .Prompt }}<|endoftext|>
### ASSISTANT:
"""
PARAMETER stop "<|endoftext|>"
```
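
If you are not going through Ollama, the same prompt layout can be used directly against the GGUF file, for example with llama-cpp-python. The following is only a minimal sketch under that assumption; the system prompt and the instruction text are placeholders, and the Q5_K_M file is assumed to sit in the current directory.

```
# Minimal sketch, assuming llama-cpp-python is installed (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(model_path="./phi-2-openhermes-2.5.Q5_K_M.gguf", n_ctx=2048)

system = "You are a helpful assistant."                       # placeholder system prompt
instruction = "Explain what a GGUF file is in one sentence."  # placeholder instruction

# Same layout as the Ollama TEMPLATE above: system text, then ### USER / ### ASSISTANT.
prompt = f"{system}\n### USER: {instruction}<|endoftext|>\n### ASSISTANT:"

out = llm(prompt, max_tokens=256, stop=["<|endoftext|>"])
print(out["choices"][0]["text"])
```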

Many kudos to [Microsoft](https://huggingface.co/microsoft), [Teknium](https://huggingface.co/datasets/teknium), and [Minghao Wu](https://huggingface.co/minghaowu).

---

# Original Model Card
This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on the teknium/OpenHermes-2.5 dataset.

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an approximate `TrainingArguments` equivalent is sketched after the list):
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1.0
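
For orientation, here is how the list above might map onto Hugging Face `TrainingArguments`. This is a hedged reconstruction from the reported values, not the original training script; the output directory is a placeholder, and the Adam settings are simply the library defaults.

```
# Rough TrainingArguments equivalent of the reported values (not the original script).
# Adam betas (0.9, 0.999) and epsilon 1e-08 are already the library defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi-2-OpenHermes-2.5",   # placeholder output directory
    learning_rate=5e-5,
    per_device_train_batch_size=4,       # train_batch_size: 4
    per_device_eval_batch_size=8,        # eval_batch_size: 8
    gradient_accumulation_steps=16,      # 4 per device x 2 GPUs x 16 steps = 128 total
    num_train_epochs=1.0,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    seed=42,
)
```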

### Training results

### Framework versions

- Transformers 4.37.2
- Pytorch 2.0.1+cu117
- Datasets 2.16.1
- Tokenizers 0.15.1

### Inference

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "minghaowu/phi-2-OpenHermes-2.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device_map="auto")

# Replace the placeholder with your own instruction; the prompt mirrors the training format.
your_instruction = "<your_instruction>"
infer_prompt = f"### USER: {your_instruction} <|endoftext|>\n### ASSISTANT:"
output = pipe(infer_prompt, do_sample=True, max_new_tokens=256)[0]["generated_text"]
print(output)
```