Commit 3c268dd by jartine (1 parent: 79aaa08)

Update README.md

Files changed (1)
  1. README.md +59 -0
README.md CHANGED
@@ -71,6 +71,65 @@ context size to be available with llamafile for any given model, you can
  pass the `-c 0` flag. The default temperature for these llamafiles is
  0.8 because it helps for this model. It can be tuned, e.g. `--temp 0`.

+ ## Benchmarks
+
+ | hardware | model\_filename | size | test | t/s |
+ | :----------------------------------------- | :--------------------------------------- | ---------: | ------------: | --------------: |
+ | NVIDIA GeForce RTX 4090 (cuBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | pp512 | 7264.74 |
+ | NVIDIA GeForce RTX 4090 (cuBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | tg16 | 58.27 |
+ | NVIDIA GeForce RTX 4090 (cuBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | pp512 | 4236.95 |
+ | NVIDIA GeForce RTX 4090 (cuBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | tg16 | 114.65 |
+ | NVIDIA GeForce RTX 4090 (tinyBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | pp512 | 3457.31 |
+ | NVIDIA GeForce RTX 4090 (tinyBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | tg16 | 85.20 |
+ | NVIDIA GeForce RTX 4090 (tinyBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | pp512 | 1284.87 |
+ | NVIDIA GeForce RTX 4090 (tinyBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | tg16 | 49.76 |
+ | AMD Radeon RX 7900 XTX (hipBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | pp512 | 3239.27 |
+ | AMD Radeon RX 7900 XTX (hipBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | tg16 | 37.41 |
+ | AMD Radeon RX 7900 XTX (hipBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | pp512 | 2647.72 |
+ | AMD Radeon RX 7900 XTX (hipBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | tg16 | 85.42 |
+ | AMD Radeon RX 7900 XTX (tinyBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | pp512 | 1226.20 |
+ | AMD Radeon RX 7900 XTX (tinyBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | tg16 | 76.29 |
+ | AMD Radeon RX 7900 XTX (tinyBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | pp512 | 1033.91 |
+ | AMD Radeon RX 7900 XTX (tinyBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | tg16 | 35.41 |
+ | Apple M2 Ultra (60-core Metal GPU) | mistral-7b-instruct-v0.3.Q6\_K | 5.54 GiB | pp512 | 761.88 |
+ | Apple M2 Ultra (60-core Metal GPU) | mistral-7b-instruct-v0.3.Q6\_K | 5.54 GiB | tg16 | 64.15 |
+ | Apple M2 Ultra (ARMv8+fp16+dotprod) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | pp512 | 109.18 |
+ | Apple M2 Ultra (ARMv8+fp16+dotprod) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | tg16 | 15.17 |
+ | Intel Core i9-14900K (alderlake) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | pp512 | 95.87 |
+ | Intel Core i9-14900K (alderlake) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | tg16 | 12.66 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.BF16.gguf | 13.50 GiB | pp512 | 759.25 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.BF16.gguf | 13.50 GiB | tg16 | 19.29 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.F16.gguf | 13.50 GiB | pp512 | 559.94 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.F16.gguf | 13.50 GiB | tg16 | 19.26 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q8\_0.gguf | 7.17 GiB | pp512 | 518.76 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q8\_0.gguf | 7.17 GiB | tg16 | 26.31 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q6\_K.gguf | 5.54 GiB | pp512 | 726.13 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q6\_K.gguf | 5.54 GiB | tg16 | 38.65 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_1.gguf | 5.07 GiB | pp512 | 534.04 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_1.gguf | 5.07 GiB | tg16 | 38.68 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_K\_M.gguf | 4.78 GiB | pp512 | 723.25 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_K\_M.gguf | 4.78 GiB | tg16 | 41.13 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_0.gguf | 4.65 GiB | pp512 | 536.67 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_0.gguf | 4.65 GiB | tg16 | 42.46 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_K\_S.gguf | 4.65 GiB | pp512 | 651.05 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_K\_S.gguf | 4.65 GiB | tg16 | 42.14 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_1.gguf | 4.24 GiB | pp512 | 572.67 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_1.gguf | 4.24 GiB | tg16 | 43.19 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_K\_M.gguf | 4.07 GiB | pp512 | 728.48 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_K\_M.gguf | 4.07 GiB | tg16 | 44.29 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_K\_S.gguf | 3.86 GiB | pp512 | 666.82 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_K\_S.gguf | 3.86 GiB | tg16 | 45.18 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_0.gguf | 3.83 GiB | pp512 | 562.96 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_0.gguf | 3.83 GiB | tg16 | 48.02 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q3\_K\_L.gguf | 3.56 GiB | pp512 | 706.64 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q3\_K\_L.gguf | 3.56 GiB | tg16 | 46.82 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q3\_K\_M.gguf | 3.28 GiB | pp512 | 715.62 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q3\_K\_M.gguf | 3.28 GiB | tg16 | 48.29 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q3\_K\_S.gguf | 2.95 GiB | pp512 | 722.11 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q3\_K\_S.gguf | 2.95 GiB | tg16 | 49.76 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q2\_K.gguf | 2.53 GiB | pp512 | 739.28 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q2\_K.gguf | 2.53 GiB | tg16 | 53.01 |
+
  ## About llamafile

  llamafile is a new format introduced by Mozilla Ocho on Nov 20th 2023.