Optimizing PyTorch models with Neural Network Compression Framework of OpenVINO™ by 8-bit quantization.

This tutorial demonstrates how to use NNCF 8-bit sparse quantization to optimize a PyTorch model for inference with the OpenVINO Toolkit. For more advanced usage, refer to these examples.

This notebook is based on the 'ImageNet training in PyTorch' example and uses a ResNet-50 model with the ImageNet dataset.
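For orientation, here is a minimal sketch of loading the FP32 baseline. It assumes a torchvision-pretrained ResNet-50; the notebook itself may restore a different checkpoint.

```python
from torchvision.models import resnet50, ResNet50_Weights

# FP32 baseline: ResNet-50 pre-trained on ImageNet (torchvision >= 0.13 weights API).
# Assumption: the notebook may instead load its own fine-tuned checkpoint.
fp32_model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
fp32_model.eval()
```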

Notebook Contents

This tutorial consists of the following steps (a condensed code sketch of the workflow follows the list):

  • Transforming the original dense FP32 model to a sparse INT8 model.
  • Using fine-tuning to restore accuracy.
  • Exporting the optimized and the original models to OpenVINO IR.
  • Measuring and comparing the performance of the two models.
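The steps above roughly correspond to NNCF's training-time compression API. Below is a minimal sketch, not the notebook's exact code: the sparsity schedule, epoch count, output file names, and the `train_loader` ImageNet DataLoader are assumptions, and `fp32_model` is the baseline model loaded earlier.

```python
import torch
import openvino as ov
from nncf import NNCFConfig
from nncf.torch import create_compressed_model, register_default_init_args

# 1. Joint magnitude-sparsity + 8-bit quantization pipeline.
#    The sparsity levels and schedule below are illustrative assumptions.
nncf_config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 3, 224, 224]},
    "compression": [
        {"algorithm": "magnitude_sparsity",
         "params": {"schedule": "multistep", "multistep_sparsity_levels": [0.3, 0.5]}},
        {"algorithm": "quantization"},
    ],
})
# Let NNCF collect initialization statistics (e.g. quantization ranges) from the data.
# `train_loader` is an assumed ImageNet DataLoader, not defined in this sketch.
nncf_config = register_default_init_args(nncf_config, train_loader)

# 2. Wrap the dense FP32 model and fine-tune briefly to restore accuracy.
compression_ctrl, model = create_compressed_model(fp32_model, nncf_config)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()
for epoch in range(2):  # epoch count is an assumption
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels) + compression_ctrl.loss()
        loss.backward()
        optimizer.step()
    compression_ctrl.scheduler.epoch_step()

# 3. Export both models to OpenVINO IR.
compression_ctrl.export_model("resnet50_int8_sparse.onnx")
ov.save_model(ov.convert_model("resnet50_int8_sparse.onnx"), "resnet50_int8_sparse.xml")
ov.save_model(ov.convert_model(fp32_model, example_input=torch.randn(1, 3, 224, 224)),
              "resnet50_fp32.xml")

# 4. Compare throughput with OpenVINO's benchmark_app CLI, for example:
#    benchmark_app -m resnet50_fp32.xml -d CPU
#    benchmark_app -m resnet50_int8_sparse.xml -d CPU
```

With both IR files on disk, `benchmark_app` runs the FP32 and sparse INT8 models on the same device and reports latency and throughput for a direct comparison.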

Installation Instructions

This is a self-contained example that relies solely on its own code and the accompanying config.json file.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start. For details, please refer to the Installation Guide.
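As a starting point, the environment setup can be done in a first notebook cell along these lines; the version pins are assumptions rather than requirements stated by this example:

```python
# Run inside the virtual environment that hosts the Jupyter kernel.
%pip install -q torch torchvision "openvino>=2023.1" "nncf>=2.6"
```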