# Post-Training Quantization of PyTorch models with NNCF

This tutorial demonstrates how to use [NNCF](https://github.com/openvinotoolkit/nncf) 8-bit quantization in
post-training mode (that is, without a fine-tuning pipeline) to optimize a [PyTorch](https://pytorch.org/) model
for high-speed inference with the [OpenVINO Toolkit](https://docs.openvino.ai/). For more advanced NNCF
usage, refer to these [examples](https://github.com/openvinotoolkit/nncf/tree/develop/examples).
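
To give a feel for what 8-bit quantization does to the numbers in a model, here is a minimal, library-free sketch of affine (asymmetric) quantization of a list of `FP32` values to signed `INT8` and back. It is only an illustration of the general idea: the helper names `quantize_int8` and `dequantize` are hypothetical, and the scheme NNCF actually applies (per-channel scales, calibration statistics, and so on) may differ.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit integers with a scale and zero point.

    Illustrative helper, not part of NNCF or OpenVINO.
    """
    qmin, qmax = -128, 127
    # The representable range must contain 0.0 so that zero maps exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to a scale of 1.0 when every value is zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    quantized = [
        max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate float values from the 8-bit representation."""
    return [(q - zero_point) * scale for q in quantized]


if __name__ == "__main__":
    weights = [-1.0, -0.5, 0.0, 0.25, 2.0]
    q, scale, zero_point = quantize_int8(weights)
    print("int8 values:", q)
    print("restored:   ", dequantize(q, scale, zero_point))
```

Each restored value differs from the original by a small rounding error tied to the scale, which is why the tutorial evaluates accuracy both before and after quantization.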

To speed up download and validation, this tutorial uses a pre-trained [ResNet-50](https://arxiv.org/abs/1512.03385)
model on the [Tiny ImageNet](http://cs231n.stanford.edu/reports/2015/pdfs/leonyao_final.pdf) dataset.

## Notebook contents

The tutorial consists of the following steps:

* Evaluating the original model.
* Transforming the original `FP32` model to `INT8`.
* Exporting the optimized and original models to ONNX and then to OpenVINO IR.
* Comparing the performance of the resulting `FP32` and `INT8` models.

## Installation Instructions

This is a self-contained example that relies solely on its own code.<br>
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to the [Installation Guide](../../README.md).