Initial model

Files changed (4)

README.md ADDED

+---
+language:
+- id
+datasets:
+- allenai/c4
+---
+# Indonesian T5 Large
+T5 (Text-to-Text Transfer Transformer) model pretrained on Indonesian mC4 with [extra filtering](https://github.com/Wikidepia/indonesian_datasets/tree/master/dump/mc4). This model is pretrained only and must be fine-tuned before it can be used for a specific task.
+## Pretraining Details
+Trained for 500K steps following [`google/t5-v1_1-large`](https://huggingface.co/google/t5-v1_1-large).
+## Model Performance
+TBD
+## Limitations and bias
+Like other language models trained on a large-scale corpus, this model may produce biased, unethical, or harmful output because of biases in its training data. Assume that this can happen, and use the model only in applications where such output cannot cause harm.
+## Acknowledgement
+Thanks to TensorFlow Research Cloud for providing TPU v3-8s.

config.json ADDED

+{
+  "_name_or_path": "/home/patrick/hugging_face/t5/t5-v1_1-large",
+  "architectures": [
+    "T5ForConditionalGeneration"
+  ],
+  "d_ff": 2816,
+  "d_kv": 64,
+  "d_model": 1024,
+  "decoder_start_token_id": 0,
+  "dropout_rate": 0.1,
+  "eos_token_id": 1,
+  "feed_forward_proj": "gated-gelu",
+  "gradient_checkpointing": false,
+  "initializer_factor": 1.0,
+  "is_encoder_decoder": true,
+  "layer_norm_epsilon": 1e-06,
+  "model_type": "t5",
+  "num_decoder_layers": 24,
+  "num_heads": 16,
+  "num_layers": 24,
+  "output_past": true,
+  "pad_token_id": 0,
+  "relative_attention_num_buckets": 32,
+  "tie_word_embeddings": false,
+  "transformers_version": "4.8.1",
+  "use_cache": true,
+  "vocab_size": 32128
+}

pytorch_model.bin ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:4ab9f01c5f3615d0560e155ee31b0adbcbcf8edbbf7d8ec450384dd4c2d3d4c1
+size 3132845093
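
As a sanity check, the checkpoint size recorded in this pointer can be compared with the parameter count implied by `config.json`. The sketch below is a rough estimate, not part of the commit. It assumes the T5 v1.1 layout as implemented in Hugging Face Transformers (bias-free linear layers, weight-only layer norms, one relative attention bias table per stack, three feed-forward matrices when `feed_forward_proj` is gated) and float32 weights. The helper name `t5_param_count` is hypothetical.

```python
# Values copied from the config.json added in this commit.
config = {
    "d_model": 1024,
    "d_kv": 64,
    "d_ff": 2816,
    "num_heads": 16,
    "num_layers": 24,
    "num_decoder_layers": 24,
    "relative_attention_num_buckets": 32,
    "feed_forward_proj": "gated-gelu",
    "tie_word_embeddings": False,
    "vocab_size": 32128,
}


def t5_param_count(cfg):
    """Estimate the number of parameters (hypothetical helper, see assumptions above)."""
    d_model = cfg["d_model"]
    inner = cfg["num_heads"] * cfg["d_kv"]

    # q, k, v and o projections, each d_model x inner, without biases.
    attn = 4 * d_model * inner
    # Gated feed-forward uses wi_0, wi_1 and wo; the non-gated variant uses wi and wo.
    ff_mats = 3 if cfg["feed_forward_proj"].startswith("gated") else 2
    ff = ff_mats * d_model * cfg["d_ff"]
    norm = d_model  # layer norm has a scale vector only
    # Relative position bias table, stored once per stack (in the first block).
    rel_bias = cfg["relative_attention_num_buckets"] * cfg["num_heads"]

    enc_layer = (attn + norm) + (ff + norm)
    dec_layer = 2 * (attn + norm) + (ff + norm)  # self-attention and cross-attention

    encoder = cfg["num_layers"] * enc_layer + rel_bias + norm
    decoder = cfg["num_decoder_layers"] * dec_layer + rel_bias + norm

    embed = cfg["vocab_size"] * d_model
    lm_head = 0 if cfg["tie_word_embeddings"] else embed
    return encoder + decoder + embed + lm_head


params = t5_param_count(config)
fp32_bytes = 4 * params          # assumption: weights stored as float32
lfs_size = 3132845093            # size field of the pytorch_model.bin pointer

print(f"parameters:       {params:,}")
print(f"float32 bytes:    {fp32_bytes:,}")
print(f"pointer size:     {lfs_size:,}")
print(f"remaining bytes:  {lfs_size - fp32_bytes:,}")
```

Under these assumptions the estimate comes to roughly 783M parameters, or about 3.13 GB in float32. That is within a few hundred kilobytes of the recorded size, which is consistent with serialization overhead in the checkpoint file.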

spiece.model ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:ec91e24db6b3ab052b7a93bd6ac3fc0d06727ff3a57d462cada3c00783430173
+size 793027
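
The two large files are stored as Git LFS pointers, which record only a SHA-256 digest and a byte size. A downloaded copy can be checked against those fields. The following is a minimal sketch using only the Python standard library. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path in the usage comment is an assumption.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest in `pointer`."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so multi-gigabyte checkpoints do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer contents for spiece.model, copied from this commit (without the diff "+").
SPIECE_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ec91e24db6b3ab052b7a93bd6ac3fc0d06727ff3a57d462cada3c00783430173
size 793027
"""

pointer = parse_lfs_pointer(SPIECE_POINTER)
# Usage, assuming a local copy named "spiece.model" exists:
# print(matches_pointer("spiece.model", pointer))
```

Checking the size first would be a cheap shortcut, but hashing the whole file is what actually confirms the content matches the commit.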