File size: 3,040 Bytes
d4ce69e |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 |
---
pipeline_tag: token-classification
datasets:
- conll2003
metrics:
- precision
- recall
- f1
- accuracy
tags:
- distilbert
---
**task**: `token-classification`
**Backend:** `sagemaker-training`
**Backend args:** `{'instance_type': 'ml.g4dn.2xlarge', 'supported_instructions': None}`
**Number of evaluation samples:** `All dataset`
Fixed parameters:
* **model_name_or_path**: `elastic/distilbert-base-uncased-finetuned-conll03-english`
* **dataset**:
* **path**: `conll2003`
* **eval_split**: `validation`
* **data_keys**: `{'primary': 'tokens'}`
* **ref_keys**: `['ner_tags']`
* **calibration_split**: `train`
* **quantization_approach**: `static`
* **operators_to_quantize**: `['Add', 'MatMul']`
* **per_channel**: `False`
* **calibration**:
* **method**: `minmax`
* **num_calibration_samples**: `100`
* **framework**: `onnxruntime`
* **framework_args**:
* **opset**: `11`
* **optimization_level**: `1`
* **aware_training**: `False`
Benchmarked parameters:
* **node_exclusion**: `[]`, `['layernorm', 'gelu', 'residual', 'gather', 'softmax']`
# Evaluation
## Non-time metrics
| node_exclusion | | precision (original) | precision (optimized) | | recall (original) | recall (optimized) | | f1 (original) | f1 (optimized) | | accuracy (original) | accuracy (optimized) |
| :------------------------------------------------------: | :-: | :------------------: | :-------------------: | :-: | :---------------: | :----------------: | :-: | :-----------: | :------------: | :-: | :-----------------: | :------------------: |
| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 0.936 | 0.904 | \| | 0.944 | 0.921 | \| | 0.940 | 0.912 | \| | 0.988 | 0.984 |
| `[]` | \| | 0.936 | 0.065 | \| | 0.944 | 0.243 | \| | 0.940 | 0.103 | \| | 0.988 | 0.357 |
## Time metrics
Time benchmarks were run for 15 seconds per config.
Below, time metrics for batch size = 4, input length = 64.
| node_exclusion | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) |
| :------------------------------------------------------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: |
| `['layernorm', 'gelu', 'residual', 'gather', 'softmax']` | \| | 103.46 | 53.77 | \| | 9.67 | 18.60 |
| `[]` | \| | 90.62 | 65.86 | \| | 11.07 | 15.20 |
|