---
license: bigscience-bloom-rail-1.0
---
# BahasaGPT-1 Fine-Tuning Documentation Summary (INT8)
## Introduction
This document provides an overview of BahasaGPT-1, an instruction-following model fine-tuned for the Indonesian language. The model is based on the Bloomz-7B-mt architecture and is fine-tuned on a dataset of over 70,000 Indonesian instructions.
## Model Details
**Model Name:** BahasaGPT-1
**Model Source:** Bloomz-7B-mt
**Dataset for Fine-Tuning:** Over 70,000 Indonesian instructions, generated using the Alpaca method from the following sources:
- [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca)
- Translated instructions from OA ([Anh/data at main · LAION-AI/Anh](https://github.com/LAION-AI/Anh))
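This card does not include a usage snippet, so here is a minimal loading sketch. The repository id below is a placeholder (the actual Hub id is not stated here), and 8-bit loading assumes the `bitsandbytes` package and a CUDA GPU are available.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/BahasaGPT-1_int8"  # hypothetical repo id; replace with the real one

tokenizer = AutoTokenizer.from_pretrained(model_id)
# load_in_8bit quantizes the linear layers to INT8 via bitsandbytes,
# roughly halving memory use relative to fp16 for this 7B model
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    load_in_8bit=True,
)
```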
## Fine-Tuning Process
The BahasaGPT-1 model was fine-tuned using a dataset of over 70,000 Indonesian instructions, which were generated using the Alpaca method from Stanford and translated instructions from OA. This combination of datasets allowed the model to be better adapted to the specific needs of Indonesian language tasks.
The fine-tuning process involved iteratively updating the model's parameters on this instruction dataset to optimize its performance on Indonesian-language tasks.
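The exact prompt template is not specified in this card. As a hypothetical illustration, Alpaca-method datasets conventionally wrap each instruction in the Stanford Alpaca format (the no-input variant is shown); whether BahasaGPT-1 uses this exact template, or a translated one, is an assumption.

```python
# Standard Stanford Alpaca prompt format (no-input variant); assumed here
# for illustration, not confirmed by the model card.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a raw instruction in the Alpaca-style prompt format."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

print(build_prompt("Jelaskan apa itu pembelajaran mesin."))
```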
## Known Limitations
Despite the successful fine-tuning, the BahasaGPT-1 model still has some limitations:
1. **Hallucination:** The model sometimes generates outputs that may seem plausible but are not based on the input data. This may lead to incorrect or nonsensical responses in some cases.
2. **Repeated Tokens:** The model occasionally produces repeated tokens in its output, which can hurt the coherence and readability of the generated text (a common decoding-time mitigation is sketched below).
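Repetition can often be reduced at decoding time with standard `generate()` options. This is a hedged sketch using generic Hugging Face decoding parameters, not settings documented for this model; `model`, `tokenizer`, and `build_prompt` are reused from the sketches above.

```python
prompt = build_prompt("Tuliskan ringkasan singkat tentang sejarah Indonesia.")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,   # penalize tokens that already appeared
    no_repeat_ngram_size=3,   # block any 3-gram from repeating verbatim
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The values are illustrative; `repetition_penalty` and `no_repeat_ngram_size` usually need per-task tuning, since aggressive settings can also suppress legitimate repetition.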
## Conclusion
The BahasaGPT-1 model is a fine-tuned language model for Indonesian-language tasks, based on the Bloomz-7B-mt architecture. It was trained on over 70,000 Indonesian instructions generated using the Alpaca method, together with translated instructions from OA. Despite some limitations, such as occasional hallucination and repeated tokens, the model is a valuable tool for Indonesian-language work.