pankajmathur committed on
Commit 1723a2c · verified · 1 Parent(s): e9c2775

Update README.md

Files changed (1): README.md (+3 -4)
README.md CHANGED
@@ -38,8 +38,7 @@ Hello Orca Mini, what can you do for me?<|eot_id|>
 <|start_header_id|>assistant<|end_header_id|>
 ```
 
-Below shows a code example on how to use this model in default half precision (bfloat16) format
-
+Below shows a code example on how to use this model in default half precision (bfloat16), it requires around ~133GB VRAM
 ```python
 import torch
 from transformers import pipeline
@@ -58,7 +57,7 @@ outputs = pipeline(messages, max_new_tokens=128, do_sample=True, temperature=0.0
 print(outputs[0]["generated_text"][-1])
 ```
 
-Below shows a code example on how to use this model in 4-bit format via bitsandbytes library
+Below shows a code example on how to use this model in 4-bit format via bitsandbytes library, it requires around ~39GB VRAM
 
 ```python
 import torch
@@ -86,7 +85,7 @@ print(outputs[0]["generated_text"][-1])
 
 ```
 
-Below shows a code example on how to use this model in 8-bit format via bitsandbytes library
+Below shows a code example on how to use this model in 8-bit format via bitsandbytes library, it requires around ~69GB VRAM
 
 ```python
 import torch
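
For context, the half-precision example annotated by the first hunk follows the standard transformers chat-pipeline pattern visible in the surviving context lines (`from transformers import pipeline`, `outputs = pipeline(messages, ...)`). A minimal sketch of that pattern, assuming a placeholder model id (the diff does not show the actual repo id) and illustrative generation settings (the README's own sampling values are truncated in the hunk header):

```python
import torch
from transformers import pipeline

# Placeholder: substitute the actual repo id from this model card.
model_id = "pankajmathur/model_name"

# Default half precision (bfloat16); per the updated README this
# requires around ~133GB of VRAM.
generator = pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    # System prompt is an assumption; the diff only shows the user turn.
    {"role": "system", "content": "You are Orca Mini, a helpful AI assistant."},
    {"role": "user", "content": "Hello Orca Mini, what can you do for me?"},
]

# Sampling settings are illustrative; the diff truncates the README's values.
outputs = generator(messages, max_new_tokens=128, do_sample=True, temperature=0.3)
print(outputs[0]["generated_text"][-1])
```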
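Likewise, the 4-bit and 8-bit examples annotated by the other two hunks load the model through the bitsandbytes integration; the two variants differ only in which flag is set on the quantization config. A sketch under the same placeholder assumptions:

```python
import torch
from transformers import BitsAndBytesConfig, pipeline

# Placeholder: substitute the actual repo id from this model card.
model_id = "pankajmathur/model_name"

# 4-bit NF4 quantization (~39GB VRAM per the updated README).
# For the 8-bit variant (~69GB VRAM), use
# BitsAndBytesConfig(load_in_8bit=True) instead.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

generator = pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"quantization_config": bnb_config},
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Hello Orca Mini, what can you do for me?"},
]
outputs = generator(messages, max_new_tokens=128)
print(outputs[0]["generated_text"][-1])
```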