---
license: apache-2.0
tags:
- generated_from_keras_callback
model-index:
- name: n3wtou/mt5-small-finedtuned-4-swahili
  results: []
datasets:
- csebuetnlp/xlsum
language:
- sw
metrics:
- rouge
---

# n3wtou/mt5-small-finedtuned-4-swahili

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the Swahili subset of the [csebuetnlp/xlsum](https://huggingface.co/datasets/csebuetnlp/xlsum/viewer/swahili/train) dataset.
After the final training epoch it reaches the following losses:
- Train Loss: 3.0006
- Validation Loss: 2.7015
- Epoch: 9

## Model description

This model is google/mt5-small fine-tuned for abstractive text generation in Kiswahili.

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0003, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0003, 'decay_steps': 9900, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 100, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
- training_precision: mixed_float16

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 7.0434     | 3.2396          | 0     |
| 4.3604     | 3.0452          | 1     |
| 3.9184     | 2.9186          | 2     |
| 3.6516     | 2.8443          | 3     |
| 3.4569     | 2.7955          | 4     |
| 3.3146     | 2.7645          | 5     |
| 3.2039     | 2.7292          | 6     |
| 3.1135     | 2.7182          | 7     |
| 3.0450     | 2.7040          | 8     |
| 3.0006     | 2.7015          | 9     |

### Framework versions

- Transformers 4.29.2
- TensorFlow 2.12.0
- Datasets 2.12.0
- Tokenizers 0.13.3
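
The optimizer entry under "Training hyperparameters" describes a learning rate that warms up for 100 steps and then decays polynomially over 9900 steps. The sketch below reproduces that shape in plain Python so the schedule is easier to read. It is an illustration, not the training code: the function name `learning_rate` is hypothetical, and it assumes the warmup wrapper passes `step - warmup_steps` to the decay function and that, with `cycle` set to `False`, the decay step is clamped at `decay_steps`.

```python
def learning_rate(
    step: int,
    initial_lr: float = 3e-4,   # 'initial_learning_rate' in the config
    warmup_steps: int = 100,    # 'warmup_steps'
    decay_steps: int = 9900,    # 'decay_steps'
    end_lr: float = 0.0,        # 'end_learning_rate'
    power: float = 1.0,         # 'power' (1.0 for both warmup and decay)
) -> float:
    """Approximate the warmup + polynomial decay schedule from the config.

    Assumption: the decay phase starts counting from the end of warmup and
    stays at end_lr once decay_steps have elapsed (cycle=False).
    """
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to initial_lr.
        return initial_lr * (step / warmup_steps) ** power

    # Decay phase: polynomial (here linear) decay from initial_lr to end_lr.
    decay_step = min(step - warmup_steps, decay_steps)
    fraction_left = 1.0 - decay_step / decay_steps
    return (initial_lr - end_lr) * fraction_left ** power + end_lr


if __name__ == "__main__":
    # Print the schedule at a few representative steps.
    for s in (0, 50, 100, 5050, 10000):
        print(s, learning_rate(s))
```

Under these assumptions the rate peaks at 0.0003 at step 100 and reaches 0.0 at step 10000. Because both `power` values are 1.0, the warmup and the decay are linear.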