Commit 766c9b2 · Minor update
Parent: 91f17ff

app.py CHANGED
@@ -9,12 +9,21 @@ demo = gr.Interface(fn=gpt.get_response, inputs=["textbox",
     gr.Slider(0.1, 2.0, value=1.0),
     gr.Dropdown(
         ["mike-chat", "mike-code", "mike-code-600m"], value="mike-chat"),
-], outputs=gr.Markdown(line_breaks=True), title="Mike Chat", article="""
+], outputs=gr.Markdown(line_breaks=True), title="Mike Chat", article="""
+Notice: if you have a GPU, I would highly recommend cloning the space and running it locally. The CPU provided by spaces isn't very fast.
+
+Mike is a small GPT-style language model. It was trained for about 8 hrs on my PC using fineweb-edu and open orca datasets. While it hallucinates a lot, it seems to be about on par with other LMs of its size (about 160M params). Model details:
 block_size: 512
 n_layers: 12
 n_heads: 12
 d_model: 768
-(Same as gpt-2 but without weight tying)
+(Same as gpt-2 but without weight tying)
+
+Architecture for Mike-Code-600m:
+block_size: 256
+n_layers: 16
+n_heads: 12
+d_model: 1536""")


 if __name__ == "__main__":
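The article text added in this commit claims "about 160M params" for the 12-layer, d_model-768 model without weight tying, and the larger model is named Mike-Code-600m. A back-of-the-envelope parameter count roughly confirms both figures. This is a sketch, not the author's code: it assumes the GPT-2 vocabulary size of 50257 (not stated in the diff) and counts only the dominant weight matrices (token/position embeddings, the usual 12·d² attention+MLP weights per block, and an untied output head), ignoring biases and LayerNorms.

```python
def gpt_param_count(n_layers: int, d_model: int, block_size: int,
                    vocab_size: int = 50257) -> int:
    """Rough parameter count for a GPT-style model without weight tying.

    Counts only the large weight matrices; biases and LayerNorm
    parameters (a fraction of a percent) are ignored.
    """
    embeddings = vocab_size * d_model + block_size * d_model  # token + position
    blocks = n_layers * 12 * d_model ** 2                     # 4d^2 attn + 8d^2 MLP per layer
    lm_head = vocab_size * d_model                            # untied output projection
    return embeddings + blocks + lm_head

# mike-chat config from the diff: 12 layers, d_model 768, block_size 512
print(gpt_param_count(12, 768, 512) / 1e6)   # ≈ 162.5M, matching "about 160M params"

# mike-code-600m config: 16 layers, d_model 1536, block_size 256
print(gpt_param_count(16, 1536, 256) / 1e6)  # ≈ 607.8M, matching the "600m" name
```

Note that without weight tying the untied head alone adds ~38M parameters at d_model 768, which is why the total lands at ~162M rather than GPT-2's ~124M at the same depth and width.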