jerryzh168 committed · verified
Commit 3b07e48 · 1 Parent(s): 7852389

Update README.md

Files changed (1):
  1. README.md (+1 −1)
README.md CHANGED
@@ -21,7 +21,7 @@ pipeline_tag: text-generation
 You can export the quantized model to an [ExecuTorch](https://github.com/pytorch/executorch) pte file, or use the [quantized pte](https://huggingface.co/pytorch/Phi-4-mini-instruct-8da4w/blob/main/phi4-mini-8da4w.pte) file directly to run on a mobile device, see [Running in a mobile app](#running-in-a-mobile-app).
 
 # Running in a mobile app
-The PTE file can be run with ExecuTorch on a mobile phone. See the [instructions](https://pytorch.org/executorch/main/llm/llama-demo-ios.html) for doing this in iOS.
+The [PTE file](https://huggingface.co/pytorch/Phi-4-mini-instruct-8da4w/blob/main/phi4-mini-8da4w.pte) can be run with ExecuTorch on a mobile phone. See the [instructions](https://pytorch.org/executorch/main/llm/llama-demo-ios.html) for doing this in iOS.
 On iPhone 15 Pro, the model runs at 17.3 tokens/sec and uses 3206 Mb of memory.
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/66049fc71116cebd1d3bdcf4/521rXwIlYS9HIAEBAPJjw.png)
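
For context on the linked pte artifact, a minimal sketch of fetching it from the Hub before bundling it into the ExecuTorch iOS demo app; this assumes the `huggingface_hub` Python package, and the repo id and filename are taken from the links in the diff above:

```python
# Minimal sketch: download the prebuilt ExecuTorch pte file from the Hub.
# Assumes `pip install huggingface_hub`. The returned local path can then
# be copied onto the device / into the ExecuTorch iOS demo app following
# the instructions linked in the README.
from huggingface_hub import hf_hub_download

pte_path = hf_hub_download(
    repo_id="pytorch/Phi-4-mini-instruct-8da4w",
    filename="phi4-mini-8da4w.pte",
)
print(pte_path)  # local cache path of the quantized model file
```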