---
language:
- en
license: mit
tags:
- text-classification
- int8
- Intel® Neural Compressor
- neural-compressor
- PostTrainingDynamic
- onnx
datasets:
- nyu-mll/glue
metrics:
- f1
model-index:
- name: camembert-base-mrpc-int8-dynamic
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: GLUE MRPC
      type: glue
      args: mrpc
    metrics:
    - type: f1
      value: 0.8842832469775476
      name: F1
---
# INT8 camembert-base-mrpc

## Post-training dynamic quantization

### PyTorch

This is an INT8 PyTorch model quantized with [Intel® Neural Compressor](https://github.com/intel/neural-compressor).

The original fp32 model comes from the fine-tuned model [camembert-base-mrpc](https://huggingface.co/Intel/camembert-base-mrpc).

The linear module **roberta.encoder.layer.6.attention.self.query** falls back to fp32 to keep the relative accuracy loss within 1%.
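For intuition, post-training dynamic quantization can be sketched with stock PyTorch on a toy model. This is only an illustration of the idea, not the Neural Compressor recipe used to produce this model; the module and shapes below are made up:

```python
import torch
import torch.nn as nn

# A toy stand-in for a fine-tuned fp32 network (any nn.Module with Linear layers).
fp32_model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

# Post-training dynamic quantization: Linear weights are stored as INT8,
# while activations are quantized on the fly at inference time.
int8_model = torch.ao.quantization.quantize_dynamic(
    fp32_model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 16)
print(int8_model(x).shape)  # same output shape as the fp32 model
```

Neural Compressor drives this kind of conversion with an accuracy-aware tuning loop, which is how it decides to leave individual modules (like the query projection above) in fp32.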

#### Test result

|   |INT8|FP32|
|---|:---:|:---:|
| **Accuracy (eval-f1)** |0.8843|0.8928|
| **Model size (MB)**  |180|422|

#### Load with Intel® Neural Compressor:

```python
from optimum.intel import INCModelForSequenceClassification

model_id = "Intel/camembert-base-mrpc-int8-dynamic"
int8_model = INCModelForSequenceClassification.from_pretrained(model_id)
```

### ONNX

This is an INT8 ONNX model quantized with [Intel® Neural Compressor](https://github.com/intel/neural-compressor).

The original fp32 model comes from the fine-tuned model [camembert-base-mrpc](https://huggingface.co/Intel/camembert-base-mrpc).

#### Test result

|   |INT8|FP32|
|---|:---:|:---:|
| **Accuracy (eval-f1)** |0.8819|0.8928|
| **Model size (MB)**  |113|423|


#### Load ONNX model:

```python
from optimum.onnxruntime import ORTModelForSequenceClassification

model_id = "Intel/camembert-base-mrpc-int8-dynamic"
model = ORTModelForSequenceClassification.from_pretrained(model_id)
```