wanadzhar913
committed on
Update README.md
README.md CHANGED
@@ -18,7 +18,7 @@ We have finetuned [mesolitica/malaysian-mistral-7b-32k-instructions-v4](https://
 
 ### Training Details
 
-Overall, solely training on the [Boolq-Malay](https://huggingface.co/datasets/wanadzhar913/boolq-malay) dataset (comprised of both Malay and English versions of the original [Boolq](https://huggingface.co/datasets/google/boolq) dataset), we use the following training parameters and obtain the following training results:
+Overall, solely training on the [Boolq-Malay](https://huggingface.co/datasets/wanadzhar913/boolq-malay) dataset (comprised of both Malay and English versions of the original [Boolq](https://huggingface.co/datasets/google/boolq) dataset) and Google Colab's A100 GPU (40GB VRAM), we use the following training parameters and obtain the following training results:
 
 - **No. of Epochs:** 0.504
 - **Per Device Train Batch Size:** 4
@@ -63,9 +63,9 @@ nf4_config = BitsAndBytesConfig(
     bnb_4bit_compute_dtype=getattr(torch, TORCH_DTYPE)
 )
 
-tokenizer = AutoTokenizer.from_pretrained('wanadzhar913/malaysian-mistral-llmasajudge-
+tokenizer = AutoTokenizer.from_pretrained('wanadzhar913/malaysian-mistral-llmasajudge-v2')
 model = AutoModelForCausalLM.from_pretrained(
-    'wanadzhar913/malaysian-mistral-llmasajudge-
+    'wanadzhar913/malaysian-mistral-llmasajudge-v2',
     use_flash_attention_2 = True,
     quantization_config = nf4_config
 )
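
The first hunk above lists two training parameters that map naturally onto the Hugging Face `Trainer` API. Below is a minimal, hypothetical sketch of how such a configuration could look: apart from `per_device_train_batch_size=4`, every value (output path, step count, logging cadence, precision) is an assumption for illustration, not taken from the commit.

```python
# Hypothetical reconstruction of the listed training setup, assuming the
# standard Hugging Face Trainer API. Only per_device_train_batch_size=4 is
# taken from the README; the fractional 0.504 epochs suggests training was
# capped with max_steps, so the step count here is purely illustrative.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="malaysian-mistral-llmasajudge-v2",  # assumed output path
    per_device_train_batch_size=4,                  # from the README
    max_steps=1_000,   # illustrative; a cap like this would stop training mid-epoch (~0.504)
    logging_steps=50,  # assumed
    bf16=True,         # plausible on an A100 (40GB VRAM); not stated in the commit
)
```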
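
For readers who want to run the post-commit snippet from the second hunk as-is, here is a self-contained sketch under stated assumptions: the hunk only shows the tail of the `BitsAndBytesConfig(...)` call, so the `TORCH_DTYPE` value and the NF4 flags other than `bnb_4bit_compute_dtype` are guesses at typical values, not the README's actual settings.

```python
# Self-contained version of the post-commit loading snippet. The hunk only
# shows the tail of the BitsAndBytesConfig call, so every NF4 flag except
# bnb_4bit_compute_dtype is an assumed, typical value.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

TORCH_DTYPE = "bfloat16"  # assumed; the README defines TORCH_DTYPE above this hunk

nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,               # assumed
    bnb_4bit_quant_type="nf4",       # assumed, implied by the variable name
    bnb_4bit_use_double_quant=True,  # assumed
    bnb_4bit_compute_dtype=getattr(torch, TORCH_DTYPE),  # shown in the diff
)

tokenizer = AutoTokenizer.from_pretrained("wanadzhar913/malaysian-mistral-llmasajudge-v2")
model = AutoModelForCausalLM.from_pretrained(
    "wanadzhar913/malaysian-mistral-llmasajudge-v2",
    use_flash_attention_2=True,   # as in the diff
    quantization_config=nf4_config,
)
```

Note that recent `transformers` releases deprecate the `use_flash_attention_2` keyword in favour of `attn_implementation="flash_attention_2"`; the sketch keeps the README's original spelling.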