Update README.md
README.md CHANGED
@@ -5,31 +5,34 @@ tags:
 model-index:
 - name: retnet-mini-shakespeare
   results: []
+pipeline_tag: text-generation
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
 # retnet-mini-shakespeare
 
-This model was trained from scratch on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 2.7718
+This model was trained from scratch on the "tinyshakespeare" text file.
 
 ## Model description
 
-More information needed
+A tiny model, similar to jploski/falcon-mini-shakespeare, that demonstrates training and recurrent inference with a retention network (https://arxiv.org/pdf/2307.08621.pdf).
+The code uses Sehyun Choi's retention network implementation (https://github.com/syncdoth/RetNet), with the configuration parameters changed to make the model very small.
+
+- **License:** Apache 2.0
 
 ## Intended uses & limitations
 
-More information needed
+Intended to demonstrate training and recurrent O(1) inference using a retention network.
 
 ## Training and evaluation data
 
-More information needed
+https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt
 
 ## Training procedure
 
+The single tinyshakespeare text file, split into paragraphs, was used as both the training and the validation set. See:
+
+https://colab.research.google.com/drive/1wZnM7FCe4TsQpoamJ7NDAuQfA3DYiwHi?usp=sharing
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -59,4 +62,4 @@ The following hyperparameters were used during training:
 - Transformers 4.31.0
 - Pytorch 2.0.1+cu118
 - Datasets 2.14.3
-- Tokenizers 0.13.3
+- Tokenizers 0.13.3
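A minimal generation sketch for the updated card, assuming the checkpoint loads through the standard `transformers` auto-classes with `trust_remote_code=True` and that the repo id is `jploski/retnet-mini-shakespeare`; the concrete model classes come from the syncdoth/RetNet code, so the exact loading path may differ:

```python
# Hedged usage sketch: the repo id and the trust_remote_code loading path are
# assumptions, since the checkpoint uses custom RetNet code rather than a
# built-in transformers architecture.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "jploski/retnet-mini-shakespeare"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)

inputs = tokenizer("ROMEO:", return_tensors="pt")
# A RetNet can decode in its recurrent mode, carrying a fixed-size state
# instead of a key/value cache that grows with the sequence.
out = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=40)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```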
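The "recurrent O(1) inference" claim refers to the retention recurrence from the linked paper: per head, the state is updated as S_n = gamma * S_{n-1} + k_n^T v_n and read out as o_n = q_n S_n, so each decoding step touches only a fixed-size state. A toy PyTorch illustration of that recurrence (not the model's actual code):

```python
# Toy single-head retention recurrence from the RetNet paper:
#   S_n = gamma * S_{n-1} + k_n^T v_n ;  o_n = q_n @ S_n
# The state S stays d x d regardless of sequence length, hence O(1) per token.
import torch

d, gamma = 8, 0.9
S = torch.zeros(d, d)                   # fixed-size recurrent state
for step in range(16):                  # one iteration per generated token
    q, k, v = (torch.randn(1, d) for _ in range(3))
    S = gamma * S + k.T @ v             # constant-cost state update
    o = q @ S                           # retention output for this token
```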
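The data preparation described in the card is short enough to sketch in a few lines; splitting on blank lines is an assumed reading of "split into paragraphs", so treat this as a sketch rather than the linked notebook's exact preprocessing:

```python
# Sketch of the data preparation described in the card: fetch tinyshakespeare,
# split it into paragraphs, and reuse the same set for training and validation.
import urllib.request
from datasets import Dataset

URL = ("https://raw.githubusercontent.com/karpathy/char-rnn/"
       "master/data/tinyshakespeare/input.txt")
text = urllib.request.urlopen(URL).read().decode("utf-8")

# Assumed reading of "split into paragraphs": split on blank lines.
paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]

ds = Dataset.from_dict({"text": paragraphs})
train_ds, eval_ds = ds, ds  # the card uses the same data for both sets
```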