omkarthawakar committed: Update README.md

README.md

The current `transformers` version can be verified with: `pip list | grep transformers`
To load a specific checkpoint, simply pass a revision with a value between `"ckpt_000"` and `"ckpt_358"`. If no revision is provided, it will load `"ckpt_359"`, which is the final checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Place the model and all new tensors on the GPU by default.
torch.set_default_device("cuda")

model = AutoModelForCausalLM.from_pretrained("MBZUAI/MobiLlama-05B", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("MBZUAI/MobiLlama-05B", trust_remote_code=True)

text = "I was dancing in the river when "
input_ids = tokenizer(text, return_tensors="pt").to('cuda').input_ids
outputs = model.generate(input_ids, max_length=1000, repetition_penalty=1.2, pad_token_id=tokenizer.eos_token_id)
# Decode only the newly generated tokens, dropping the prompt and the final token.
print(tokenizer.batch_decode(outputs[:, input_ids.shape[1]:-1])[0].strip())
```
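For example, an intermediate checkpoint can be loaded by passing its revision name to `from_pretrained`; `"ckpt_100"` below is an arbitrary illustrative pick from the range above:

```python
# Load an intermediate training snapshot. "ckpt_100" is an arbitrary example;
# any revision from "ckpt_000" to "ckpt_358" works the same way.
model = AutoModelForCausalLM.from_pretrained(
    "MBZUAI/MobiLlama-05B",
    revision="ckpt_100",
    torch_dtype="auto",
    trust_remote_code=True,
)
```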
## Evaluation

| Evaluation Benchmark | MobiLlama-0.5B | MobiLlama-0.8B | MobiLlama-1.2B |
| ----------- | ----------- | ----------- | ----------- |
| HellaSwag | 0.5252 | 0.5409 | 0.6299 |
| MMLU | 0.2645 | 0.2692 | 0.2423 |
| Arc Challenge | 0.2952 | 0.3020 | 0.3455 |
| TruthfulQA | 0.3805 | 0.3848 | 0.3557 |
| CrowsPairs | 0.6403 | 0.6482 | 0.6812 |
| PIQA | 0.7203 | 0.7317 | 0.7529 |
| Race | 0.3368 | 0.3337 | 0.3531 |
| SIQA | 0.4022 | 0.4160 | 0.4196 |
| Winogrande | 0.5753 | 0.5745 | 0.6108 |
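The README does not state how these scores were produced. The benchmarks listed all match tasks available in EleutherAI's `lm-evaluation-harness`, so a minimal sketch of re-scoring a subset with that harness (an assumption, with task names as in harness v0.4) might look like:

```python
# Sketch only: assumes EleutherAI's lm-evaluation-harness (pip install lm-eval);
# the model card does not document the actual evaluation setup.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=MBZUAI/MobiLlama-05B,trust_remote_code=True",
    tasks=["hellaswag", "piqa", "winogrande"],  # subset of the table above
)
print(results["results"])
```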
## Intended Uses

Given the nature of the training data, the MobiLlama-05B model is best suited for prompts using the QA format, the chat format, and the code format.
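For instance, a QA-format prompt can be run through the same generation code as above; the `Question:`/`Answer:` template below is an illustrative convention, not a prompt format specified by this model card:

```python
# Reuses `model` and `tokenizer` from the loading snippet above.
# The Question/Answer template is illustrative, not an official prompt format.
prompt = "Question: What is the capital of France?\nAnswer:"
input_ids = tokenizer(prompt, return_tensors="pt").to('cuda').input_ids
outputs = model.generate(input_ids, max_length=128, repetition_penalty=1.2, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.batch_decode(outputs[:, input_ids.shape[1]:])[0].strip())
```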