mattdangerw committed on
Commit ab2afa8 · verified · 1 Parent(s): be70570

Update README.md with new model card content

Files changed (1)
  1. README.md +167 -20
README.md CHANGED
@@ -1,23 +1,170 @@
  ---
  library_name: keras-hub
  ---
- This is a [`Phi3` model](https://keras.io/api/keras_hub/models/phi3) uploaded using the KerasHub library and can be used with JAX, TensorFlow, and PyTorch backends.
- Model config:
- * **name:** phi3_backbone_1
- * **trainable:** True
- * **vocabulary_size:** 32064
- * **num_layers:** 32
- * **num_query_heads:** 32
- * **hidden_dim:** 3072
- * **intermediate_dim:** 8192
- * **num_key_value_heads:** 32
- * **layer_norm_epsilon:** 1e-05
- * **dropout:** 0.0
- * **max_sequence_length:** 131072
- * **pretraining_sequence_length:** 4096
- * **rope_max_wavelength:** 10000.0
- * **rope_scaling_type:** su
- * **rope_scaling_short_factor:** [1.05, 1.05, 1.05, 1.1, 1.1, 1.1500000000000001, 1.2000000000000002, 1.2500000000000002, 1.3000000000000003, 1.3500000000000003, 1.5000000000000004, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.0500000000000007, 2.0500000000000007, 2.0500000000000007, 2.1000000000000005, 2.1000000000000005, 2.1000000000000005, 2.1500000000000004, 2.1500000000000004, 2.3499999999999996, 2.549999999999999, 2.5999999999999988, 2.5999999999999988, 2.7499999999999982, 2.849999999999998, 2.849999999999998, 2.9499999999999975]
- * **rope_scaling_long_factor:** [1.0299999713897705, 1.0499999523162842, 1.0499999523162842, 1.0799999237060547, 1.2299998998641968, 1.2299998998641968, 1.2999999523162842, 1.4499999284744263, 1.5999999046325684, 1.6499998569488525, 1.8999998569488525, 2.859999895095825, 3.68999981880188, 5.419999599456787, 5.489999771118164, 5.489999771118164, 9.09000015258789, 11.579999923706055, 15.65999984741211, 15.769999504089355, 15.789999961853027, 18.360000610351562, 21.989999771118164, 23.079999923706055, 30.009998321533203, 32.35000228881836, 32.590003967285156, 35.56000518798828, 39.95000457763672, 53.840003967285156, 56.20000457763672, 57.95000457763672, 59.29000473022461, 59.77000427246094, 59.920005798339844, 61.190006256103516, 61.96000671386719, 62.50000762939453, 63.3700065612793, 63.48000717163086, 63.48000717163086, 63.66000747680664, 63.850006103515625, 64.08000946044922, 64.760009765625, 64.80001068115234, 64.81001281738281, 64.81001281738281]
-
- This model card has been generated automatically and should be completed by the model author. See [Model Cards documentation](https://huggingface.co/docs/hub/model-cards) for more information.
+ ## Model Overview
+ Phi-3 is a set of large language models published by Microsoft. The models are instruction-tuned and range in size from 3.8 billion to 14 billion parameters. See the Phi-3 model card linked below for benchmarks, data sources, and intended use cases.
+
+ Weights are released under the [MIT License](https://opensource.org/license/mit). Keras model code is released under the [Apache 2.0 License](https://github.com/keras-team/keras-hub/blob/master/LICENSE).
+
+ ## Links
+
+ * [Phi-3 Quickstart Notebook](https://www.kaggle.com/code/matthewdwatson/phi-3-quickstart)
+ * [Phi-3 API Documentation](https://keras.io/api/keras_hub/models/phi3/)
+ * [Phi-3 Model Card](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct)
+ * [KerasHub Beginner Guide](https://keras.io/guides/keras_hub/getting_started/)
+ * [KerasHub Model Publishing Guide](https://keras.io/guides/keras_hub/upload/)
+
+ ## Installation
+
+ Keras and KerasHub can be installed with:
+
+ ```shell
+ pip install -U -q keras-hub
+ pip install -U -q "keras>=3"
+ ```
+
+ JAX, TensorFlow, and PyTorch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the [Keras Getting Started](https://keras.io/getting_started/) page.
+
+ ## Presets
+
+ The following model checkpoints are provided by the Keras team. Full code examples for each are available below.
+
+ | Preset name | Parameters | Description |
+ |------------------------------|------------|----------------------------------|
+ | `phi3_mini_4k_instruct_en` | 3.82B | 3.8B model with 4K max context |
+ | `phi3_mini_128k_instruct_en` | 3.82B | 3.8B model with 128K max context |
+
+ ## Prompts
+
+ Phi-3 models are instruction-tuned on turn-by-turn conversations and should be prompted with text that precisely matches the training format. Specifically, you must alternate user and assistant turns, each beginning and ending with special tokens. Newlines matter. For example:
+
+ ```python
+ prompt = """<|user|>
+ Hello!<|end|>
+ <|assistant|>
+ Hello! How are you?<|end|>
+ <|user|>
+ I'm great. Could you help me with a task?<|end|>
+ """
+ ```
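A minimal sketch of assembling this format programmatically, for cases where hand-writing the special tokens is error-prone. `build_phi3_prompt` and its `add_generation_prompt` flag are hypothetical names introduced here for illustration; they are not part of KerasHub. The sketch assumes each turn is laid out as `<|role|>`, a newline, the text, `<|end|>`, and a newline, as in the example above.

```python
def build_phi3_prompt(turns, add_generation_prompt=True):
    """Join (role, text) pairs into a Phi-3 style chat prompt.

    Hypothetical helper, not a KerasHub API. `turns` is a list of
    (role, text) tuples where role is "user" or "assistant".
    """
    parts = []
    for role, text in turns:
        if role not in ("user", "assistant"):
            raise ValueError(f"unsupported role: {role!r}")
        # Each turn: role tag, newline, text, end tag, newline.
        parts.append(f"<|{role}|>\n{text}<|end|>\n")
    if add_generation_prompt:
        # A trailing assistant tag cues the model to write the next reply.
        parts.append("<|assistant|>")
    return "".join(parts)


conversation = [
    ("user", "Hello!"),
    ("assistant", "Hello! How are you?"),
    ("user", "I'm great. Could you help me with a task?"),
]
print(build_phi3_prompt(conversation))
```

With `add_generation_prompt=False` the helper reproduces the conversation string shown above. The default appends a trailing `<|assistant|>` tag, which matches the prompts passed to `generate()` in the usage examples.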
+
+ ## Example Usage
+ ```shell
+ pip install -U -q keras-hub
+ ```
+
+ ```python
+ import keras
+ import keras_hub
+ import numpy as np
+ ```
+
+ Use `generate()` to do text generation.
+
+ ```python
+ phi3_lm = keras_hub.models.Phi3CausalLM.from_preset("phi3_mini_128k_instruct_en")
+ phi3_lm.generate("<|user|>\nHow to explain Internet for a medieval knight?<|end|>\n<|assistant|>", max_length=500)
+
+ # Generate with batched prompts.
+ phi3_lm.generate([
+     "<|user|>\nWhat is Keras?<|end|>\n<|assistant|>",
+     "<|user|>\nGive me your best brownie recipe.<|end|>\n<|assistant|>",
+ ], max_length=500)
+ ```
+
+ Compile the `generate()` function with a custom sampler.
+
+ ```python
+ phi3_lm = keras_hub.models.Phi3CausalLM.from_preset("phi3_mini_128k_instruct_en")
+ phi3_lm.compile(sampler="greedy")
+ phi3_lm.generate("<|user|>\nWhat is Keras?<|end|>\n<|assistant|>", max_length=30)
+
+ phi3_lm.compile(sampler=keras_hub.samplers.BeamSampler(num_beams=2))
+ phi3_lm.generate("<|user|>\nWhat is Keras?<|end|>\n<|assistant|>", max_length=30)
+ ```
+
+ Use `generate()` without preprocessing.
+
+ ```python
+ prompt = {
+     "token_ids": np.array([[306, 864, 304, 1827, 0, 0, 0, 0, 0, 0]] * 2),
+     # Use `"padding_mask"` to indicate values that should not be overridden.
+     "padding_mask": np.array([[1, 1, 1, 1, 0, 0, 0, 0, 0, 0]] * 2),
+ }
+
+ phi3_lm = keras_hub.models.Phi3CausalLM.from_preset(
+     "phi3_mini_128k_instruct_en",
+     preprocessor=None,
+     dtype="bfloat16",
+ )
+ phi3_lm.generate(prompt)
+ ```
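The `token_ids` and `padding_mask` arrays above are written out by hand. As a rough sketch of how such a pair could be built from unpadded token ID lists, the helper below right-pads each sequence and marks real tokens with 1 and padding with 0. `pad_token_ids` is a hypothetical name, not a KerasHub function, and the pad ID of 0 is an assumption taken from the example above.

```python
def pad_token_ids(sequences, max_length, pad_id=0):
    """Right-pad token ID lists and build the matching padding mask.

    Hypothetical helper, not a KerasHub API. pad_id=0 is assumed from
    the hand-written example; check the tokenizer for the real value.
    """
    token_ids, padding_mask = [], []
    for seq in sequences:
        if len(seq) > max_length:
            raise ValueError("sequence is longer than max_length")
        num_pad = max_length - len(seq)
        token_ids.append(list(seq) + [pad_id] * num_pad)
        # 1 marks a real prompt token, 0 marks a slot left for generation.
        padding_mask.append([1] * len(seq) + [0] * num_pad)
    return {"token_ids": token_ids, "padding_mask": padding_mask}


batch = pad_token_ids([[306, 864, 304, 1827]] * 2, max_length=10)
print(batch["token_ids"][0])
print(batch["padding_mask"][0])
```

The helper returns plain Python lists so that it has no dependencies. Wrap each value with `np.array(...)` before passing the dictionary to `generate()`, as in the example above.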
+
+ Call `fit()` on a single batch.
+
+ ```python
+ features = ["The quick brown fox jumped.", "I forgot my homework."]
+ phi3_lm = keras_hub.models.Phi3CausalLM.from_preset("phi3_mini_128k_instruct_en")
+ phi3_lm.fit(x=features, batch_size=2)
+ ```
+
+ ## Example Usage with Hugging Face URI
+
+ ```shell
+ pip install -U -q keras-hub
+ ```
+
+ ```python
+ import keras
+ import keras_hub
+ import numpy as np
+ ```
+
+ Use `generate()` to do text generation.
+
+ ```python
+ phi3_lm = keras_hub.models.Phi3CausalLM.from_preset("hf://keras/phi3_mini_128k_instruct_en")
+ phi3_lm.generate("<|user|>\nHow to explain Internet for a medieval knight?<|end|>\n<|assistant|>", max_length=500)
+
+ # Generate with batched prompts.
+ phi3_lm.generate([
+     "<|user|>\nWhat is Keras?<|end|>\n<|assistant|>",
+     "<|user|>\nGive me your best brownie recipe.<|end|>\n<|assistant|>",
+ ], max_length=500)
+ ```
+
+ Compile the `generate()` function with a custom sampler.
+
+ ```python
+ phi3_lm = keras_hub.models.Phi3CausalLM.from_preset("hf://keras/phi3_mini_128k_instruct_en")
+ phi3_lm.compile(sampler="greedy")
+ phi3_lm.generate("<|user|>\nWhat is Keras?<|end|>\n<|assistant|>", max_length=30)
+
+ phi3_lm.compile(sampler=keras_hub.samplers.BeamSampler(num_beams=2))
+ phi3_lm.generate("<|user|>\nWhat is Keras?<|end|>\n<|assistant|>", max_length=30)
+ ```
+
+ Use `generate()` without preprocessing.
+
+ ```python
+ prompt = {
+     "token_ids": np.array([[306, 864, 304, 1827, 0, 0, 0, 0, 0, 0]] * 2),
+     # Use `"padding_mask"` to indicate values that should not be overridden.
+     "padding_mask": np.array([[1, 1, 1, 1, 0, 0, 0, 0, 0, 0]] * 2),
+ }
+
+ phi3_lm = keras_hub.models.Phi3CausalLM.from_preset(
+     "hf://keras/phi3_mini_128k_instruct_en",
+     preprocessor=None,
+     dtype="bfloat16",
+ )
+ phi3_lm.generate(prompt)
+ ```
+
+ Call `fit()` on a single batch.
+
+ ```python
+ features = ["The quick brown fox jumped.", "I forgot my homework."]
+ phi3_lm = keras_hub.models.Phi3CausalLM.from_preset("hf://keras/phi3_mini_128k_instruct_en")
+ phi3_lm.fit(x=features, batch_size=2)
+ ```