---
library_name: keras-hub
---
### Model Overview
An OPT decoder network.

This class implements a Transformer-based decoder model as described in
["OPT: Open Pre-trained Transformer Language Models"](https://arxiv.org/abs/2205.01068).
The default constructor gives a fully customizable, randomly initialized OPT
model with any number of layers, heads, and embedding dimensions. To load
preset architectures and weights, use the `from_preset()` constructor.

Disclaimer: Pre-trained models are provided on an "as is" basis, without
warranties or conditions of any kind. The underlying model is provided by a
third party and subject to a separate license, available
[here](https://github.com/facebookresearch/fairseq/).

__Arguments__

- __vocabulary_size__: int. The size of the token vocabulary.
- __num_layers__: int. The number of transformer decoder layers.
- __num_heads__: int. The number of attention heads for each transformer.
  The hidden size must be divisible by the number of attention heads.
- __hidden_dim__: int. The hidden size of the transformer decoder layers.
- __intermediate_dim__: int. The output dimension of the first Dense layer in
  a two-layer feedforward network for each transformer decoder layer.
- __dropout__: float. Dropout probability for the Transformer decoder.
- __max_sequence_length__: int. The maximum sequence length that this decoder
  can consume. If `None`, `max_sequence_length` defaults to the length of the
  input sequence. This determines the variable shape for positional
  embeddings.
### Example Usage
```python
import keras
import keras_hub
import numpy as np
```

Use `generate()` to do text generation.
```python
opt_lm = keras_hub.models.OPTCausalLM.from_preset("opt_1.3b_en")
opt_lm.generate("I want to say", max_length=30)

# Generate with batched prompts.
opt_lm.generate(["This is a", "Where are you"], max_length=30)
```

Compile the `generate()` function with a custom sampler.
```python
opt_lm = keras_hub.models.OPTCausalLM.from_preset("opt_1.3b_en")
opt_lm.compile(sampler="greedy")
opt_lm.generate("I want to say", max_length=30)

opt_lm.compile(sampler=keras_hub.samplers.BeamSampler(num_beams=2))
opt_lm.generate("I want to say", max_length=30)
```
Use `generate()` without preprocessing.
```python
# Prompt the model with `5338, 318` (the token ids for `"Who is"`).
# Use `"padding_mask"` to indicate values that should not be overridden.
prompt = {
    "token_ids": np.array([[5338, 318, 0, 0, 0]] * 2),
    "padding_mask": np.array([[1, 1, 0, 0, 0]] * 2),
}

opt_lm = keras_hub.models.OPTCausalLM.from_preset(
    "opt_1.3b_en",
    preprocessor=None,
)
opt_lm.generate(prompt)
```

Call `fit()` on a single batch.
```python
features = ["The quick brown fox jumped.", "I forgot my homework."]
opt_lm = keras_hub.models.OPTCausalLM.from_preset("opt_1.3b_en")
opt_lm.fit(x=features, batch_size=2)
```

Call `fit()` without preprocessing.
```python
x = {
    "token_ids": np.array([[1, 2, 3, 4, 5]] * 2),
    "padding_mask": np.array([[1, 1, 1, 1, 1]] * 2),
}
y = np.array([[2, 3, 4, 5, 0]] * 2)
sw = np.array([[1, 1, 1, 1, 1]] * 2)

opt_lm = keras_hub.models.OPTCausalLM.from_preset(
    "opt_1.3b_en",
    preprocessor=None,
)
opt_lm.fit(x=x, y=y, sample_weight=sw, batch_size=2)
```
### Example Usage with Hugging Face URI

```python
import keras
import keras_hub
import numpy as np
```

Use `generate()` to do text generation.
```python
opt_lm = keras_hub.models.OPTCausalLM.from_preset("hf://keras/opt_1.3b_en")
opt_lm.generate("I want to say", max_length=30)

# Generate with batched prompts.
opt_lm.generate(["This is a", "Where are you"], max_length=30)
```

Compile the `generate()` function with a custom sampler.
```python
opt_lm = keras_hub.models.OPTCausalLM.from_preset("hf://keras/opt_1.3b_en")
opt_lm.compile(sampler="greedy")
opt_lm.generate("I want to say", max_length=30)

opt_lm.compile(sampler=keras_hub.samplers.BeamSampler(num_beams=2))
opt_lm.generate("I want to say", max_length=30)
```

Use `generate()` without preprocessing.
```python
# Prompt the model with `5338, 318` (the token ids for `"Who is"`).
# Use `"padding_mask"` to indicate values that should not be overridden.
prompt = {
    "token_ids": np.array([[5338, 318, 0, 0, 0]] * 2),
    "padding_mask": np.array([[1, 1, 0, 0, 0]] * 2),
}

opt_lm = keras_hub.models.OPTCausalLM.from_preset(
    "hf://keras/opt_1.3b_en",
    preprocessor=None,
)
opt_lm.generate(prompt)
```

Call `fit()` on a single batch.
```python
features = ["The quick brown fox jumped.", "I forgot my homework."]
opt_lm = keras_hub.models.OPTCausalLM.from_preset("hf://keras/opt_1.3b_en")
opt_lm.fit(x=features, batch_size=2)
```

Call `fit()` without preprocessing.
```python
x = {
    "token_ids": np.array([[1, 2, 3, 4, 5]] * 2),
    "padding_mask": np.array([[1, 1, 1, 1, 1]] * 2),
}
y = np.array([[2, 3, 4, 5, 0]] * 2)
sw = np.array([[1, 1, 1, 1, 1]] * 2)

opt_lm = keras_hub.models.OPTCausalLM.from_preset(
    "hf://keras/opt_1.3b_en",
    preprocessor=None,
)
opt_lm.fit(x=x, y=y, sample_weight=sw, batch_size=2)
```