qaihm-bot committed on
Commit
57e957b
·
verified ·
1 Parent(s): 60ba704

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +3 -4
README.md CHANGED
@@ -42,11 +42,10 @@ accross various devices, can be found [here](https://aihub.qualcomm.com/models/m
  - Supported languages: English.
  - TTFT: Time To First Token is the time it takes to generate the first response token. This is expressed as a range because it varies based on the length of the prompt. The lower bound is for a short prompt (up to 128 tokens, i.e., one iteration of the prompt processor) and the upper bound is for a prompt using the full context length (4096 tokens).
  - Response Rate: Rate of response generation after the first response token.
- - Tiny MMLU: Tiny MMLU (Massive Multitask Language Understanding) is an English language benchmark designed to measure knowledge acquired during pretraining by evaluating models exclusively in zero-shot and few-shot settings. This makes the benchmark more challenging and more similar to how we evaluate humans.
 
- | Model | Device | Chipset | Target Runtime | Response Rate (tokens per second) | Time To First Token (range, seconds) | Tiny MMLU |
- |---|---|---|---|---|---|---|
- | Mistral-7B-Instruct-v0.3 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | QNN | 12.56 | 0.16565 - 5.3008 | 58.85% | Use Export Script |
+ | Model | Device | Chipset | Target Runtime | Response Rate (tokens per second) | Time To First Token (range, seconds)
+ |---|---|---|---|---|---|
+ | Mistral-7B-Instruct-v0.3 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | QNN | 12.56 | 0.16565 - 5.3008 | -- | Use Export Script |
 
  ## Deploying Mistral 7B Instruct v0.3 on-device
 
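
Note on the TTFT range in the context lines of this diff: the two bounds are consistent with the prompt processor handling 128 tokens per iteration, since 4096 / 128 = 32 iterations and 32 × 0.16565 s = 5.3008 s. The sketch below is only an illustration of that arithmetic. It assumes TTFT grows linearly with the number of iterations, which the README does not state, and the function name `estimate_ttft` is hypothetical rather than part of any Qualcomm tooling.

```python
import math

# Figures taken from the README table and the TTFT bullet.
PROMPT_CHUNK_TOKENS = 128    # tokens per prompt-processor iteration
CONTEXT_LENGTH = 4096        # full context length in tokens
TTFT_ONE_CHUNK_S = 0.16565   # reported lower bound (one iteration), seconds


def estimate_ttft(prompt_tokens: int) -> float:
    """Estimate time to first token in seconds.

    Assumption (not from the README): every prompt-processor
    iteration costs the same as the single-iteration lower bound.
    """
    if not 1 <= prompt_tokens <= CONTEXT_LENGTH:
        raise ValueError("prompt_tokens must be between 1 and CONTEXT_LENGTH")
    iterations = math.ceil(prompt_tokens / PROMPT_CHUNK_TOKENS)
    return iterations * TTFT_ONE_CHUNK_S


if __name__ == "__main__":
    for tokens in (100, 129, 4096):
        print(tokens, round(estimate_ttft(tokens), 5))
```

Under that assumption, a full 4096-token prompt reproduces the reported upper bound; intermediate prompt lengths are estimates only, not measured values.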