BahasaGPT-Chat
Introduction
This document provides an overview of BahasaGPT-Chat, a model fine-tuned to follow chat instructions in the Indonesian language. The model is based on Bloomz-7B-mt and was fine-tuned on a dataset of over 120,000 chat instructions.
Model Details
Model Name: BahasaGPT-Chat
Model Source: Bloomz-7B-mt
Dataset for Fine-Tuning: Over 120k Indonesian instructions generated using the Alpaca method from the following sources:
- Stanford Alpaca
- [Baize-Chatbot](https://github.com/project-baize/baize-chatbot)
- Translated instructions from OA (Anh/data at main · LAION-AI/Anh)
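To make the instruction format concrete, here is a minimal sketch of how one Alpaca-style record can be rendered into a training prompt. The templates follow the ones published in the Stanford Alpaca repository. Whether BahasaGPT-Chat used these exact templates is not stated in this card, and the sample record is invented for illustration only.

```python
# Prompt templates as published in the Stanford Alpaca repository.
# ASSUMPTION: this card does not say BahasaGPT-Chat used these exact strings.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def format_record(record: dict) -> str:
    """Render one {instruction, input, output} record as a training text."""
    user_input = record.get("input", "")
    # Records without an input use the shorter template.
    template = PROMPT_WITH_INPUT if user_input else PROMPT_NO_INPUT
    prompt = template.format(instruction=record["instruction"], input=user_input)
    return prompt + record["output"]


# Hypothetical record, not taken from the actual dataset.
sample = {
    "instruction": "Terjemahkan kalimat berikut ke dalam bahasa Inggris.",
    "input": "Selamat pagi.",
    "output": "Good morning.",
}
text = format_record(sample)
```

Printing `text` shows the instruction, the input, and the expected response laid out in the three labelled sections.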
Fine-Tuning Process
The BahasaGPT-Chat model was fine-tuned on a dataset of over 120k Indonesian instructions, generated using the [Baize-Chatbot](https://github.com/project-baize/baize-chatbot) method and supplemented with the Alpaca and OA translation datasets. This combination of datasets allowed the model to adapt better to Indonesian-language tasks.
The fine-tuning process involved adjusting the model's weights and biases based on the input dataset. This was done iteratively to optimize the model's performance on Indonesian-language tasks.
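The idea of iteratively adjusting weights and biases can be illustrated with a toy example. The sketch below fits a single weight and bias to four data points by gradient descent on a mean squared error. It is an illustration of the general principle only, not the actual training code; the function name, learning rate, and data are assumptions made up for this example.

```python
def fit_weight_and_bias(pairs, lr=0.05, epochs=1000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(pairs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in pairs) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in pairs) / n
        # Move each parameter a small step against its gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (illustrative only).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = fit_weight_and_bias(data)
```

Fine-tuning a language model applies the same kind of update to billions of parameters, with a loss computed on predicted tokens rather than on a single number.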
Known Limitations
Despite the successful fine-tuning, the BahasaGPT-Chat model still has some limitations:
Hallucination: The model sometimes generates outputs that seem plausible but are not grounded in the input. This may lead to incorrect or nonsensical responses in some cases.
Bias: The BahasaGPT-Chat model, like other AI language models, can exhibit various forms of bias due to the data it was trained on. This includes, but is not limited to, gender, racial, and cultural biases. As a result, the model may generate outputs that perpetuate stereotypes, exhibit unfair treatment, or show preference for specific groups or perspectives. Efforts have been made to mitigate these biases, but they may still be present in the model's responses.
Conclusion
The BahasaGPT-Chat model is a fine-tuned language model for Indonesian-language tasks, based on Bloomz-7B-mt. The model was trained on a dataset of over 120k Indonesian instructions generated using the [Baize-Chatbot](https://github.com/project-baize/baize-chatbot) method and supplemented with the Alpaca and OA translation datasets. Despite some limitations, such as occasional hallucination and bias, the model is a valuable tool for Indonesian-language tasks.
How to Run
For the Gradio demo: Gradio Code
For Colab (Int8): Colab
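For readers who want to try the model outside the linked notebooks, the following is a minimal sketch of loading a causal language model in 8-bit with the Hugging Face `transformers` library. It assumes `transformers`, `accelerate`, and `bitsandbytes` are installed and a CUDA GPU is available. The repository id is a placeholder, since this card does not state it, and the plain-text prompt is also an assumption, because the prompt format expected by the model is not documented here.

```python
# PLACEHOLDER: replace with the actual Hugging Face repository id of the model.
MODEL_ID = "<namespace>/BahasaGPT-Chat"


def load_model(model_id: str):
    """Load the tokenizer and an 8-bit quantized model."""
    # Imported lazily so this module can be imported without the heavy deps.
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",  # let accelerate place the layers
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    )
    return tokenizer, model


def generate(tokenizer, model, prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion and return only the newly produced text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the model's continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    tokenizer, model = load_model(MODEL_ID)
    # ASSUMPTION: a plain-text prompt; the real chat template may differ.
    print(generate(tokenizer, model, "Apa ibu kota Indonesia?"))
```

Loading in 8-bit roughly halves memory use compared with 16-bit weights, which is why it suits a 7B-parameter model on a single Colab GPU.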