wanadzhar913 committed (verified)
Commit 95d4a13 · 1 parent: 99e280b

Update README.md

Files changed (1): README.md (+3 −3)
README.md CHANGED
@@ -18,7 +18,7 @@ We have finetuned [mesolitica/malaysian-mistral-7b-32k-instructions-v4](https://
 
 ### Training Details
 
-Overall, solely training on the [Boolq-Malay](https://huggingface.co/datasets/wanadzhar913/boolq-malay) dataset (comprised of both Malay and English versions of the original [Boolq](https://huggingface.co/datasets/google/boolq) dataset), we use the following training parameters and obtain the following training results:
+Overall, solely training on the [Boolq-Malay](https://huggingface.co/datasets/wanadzhar913/boolq-malay) dataset (comprised of both Malay and English versions of the original [Boolq](https://huggingface.co/datasets/google/boolq) dataset) and Google Colab's A100 GPU (40GB VRAM), we use the following training parameters and obtain the following training results:
 
 - **No. of Epochs:** 0.504
 - **Per Device Train Batch Size:** 4
@@ -63,9 +63,9 @@ nf4_config = BitsAndBytesConfig(
     bnb_4bit_compute_dtype=getattr(torch, TORCH_DTYPE)
 )
 
-tokenizer = AutoTokenizer.from_pretrained('wanadzhar913/malaysian-mistral-llmasajudge-v3')
+tokenizer = AutoTokenizer.from_pretrained('wanadzhar913/malaysian-mistral-llmasajudge-v2')
 model = AutoModelForCausalLM.from_pretrained(
-    'wanadzhar913/malaysian-mistral-llmasajudge-v3',
+    'wanadzhar913/malaysian-mistral-llmasajudge-v2',
     use_flash_attention_2 = True,
     quantization_config = nf4_config
 )
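For context, the second hunk touches only fragments of the README's loading snippet. Below is a minimal, self-contained sketch of what the post-commit snippet plausibly looks like. The `TORCH_DTYPE` value and the `load_in_4bit`, `bnb_4bit_quant_type`, and `bnb_4bit_use_double_quant` settings are assumptions, since the diff shows only the `bnb_4bit_compute_dtype` line of the `BitsAndBytesConfig`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumption: the diff only shows getattr(torch, TORCH_DTYPE); bfloat16 is a common choice.
TORCH_DTYPE = 'bfloat16'

# 4-bit NF4 quantization config; only the compute-dtype line appears in the hunk,
# the remaining fields are a typical 4-bit setup, not confirmed by the commit.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=getattr(torch, TORCH_DTYPE)
)

# Post-commit repo id: the hunk renames ...-v3 to ...-v2 in both calls.
tokenizer = AutoTokenizer.from_pretrained('wanadzhar913/malaysian-mistral-llmasajudge-v2')
model = AutoModelForCausalLM.from_pretrained(
    'wanadzhar913/malaysian-mistral-llmasajudge-v2',
    use_flash_attention_2=True,
    quantization_config=nf4_config
)
```

Loading in 4-bit NF4 with a half-precision compute dtype is a common way to fit a 7B model on modest GPUs, and `use_flash_attention_2=True` matches the flag left untouched by the hunk.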