Ngit committed
Commit 55bc070 · verified · 1 Parent(s): 9dfcddd

Update README.md

Files changed (1):
  1. README.md +6 -6
README.md CHANGED
@@ -6,7 +6,7 @@ language:
 # Text Classification Toxicity
 
 This is a quantized ONNX model, a fine-tuned version of [nreimers/MiniLMv2-L6-H384-distilled-from-BERT-Large](https://huggingface.co/nreimers/MiniLMv2-L6-H384-distilled-from-BERT-Large) on the [Jigsaw 1st Kaggle competition](https://www.kaggle.com/competitions/jigsaw-toxic-comment-classification-challenge) dataset, using [unitary/toxic-bert](https://huggingface.co/unitary/toxic-bert) as the teacher model.
-The original model can be found [here](https://huggingface.co/minuva/MiniLMv2-toxic-jijgsaw)
+The original model can be found [here](https://huggingface.co/minuva/MiniLMv2-toxic-jigsaw)
 
 
 # Usage
@@ -16,7 +16,7 @@ The original model can be found [here](https://huggingface.co/minuva/MiniLMv2-toxic-jijgsaw)
 ```bash
 pip install tokenizers
 pip install onnxruntime
-git clone https://huggingface.co/minuva/MiniLMv2-toxic-jijgsaw-onnx
+git clone https://huggingface.co/minuva/MiniLMv2-toxic-jigsaw-onnx
 ```
 
 
@@ -31,7 +31,7 @@ from tokenizers import Tokenizer
 from onnxruntime import InferenceSession
 
 
-model_name = "minuva/MiniLMv2-toxic-jijgsaw-onnx"
+model_name = "minuva/MiniLMv2-toxic-jigsaw-onnx"
 tokenizer = Tokenizer.from_pretrained(model_name)
 tokenizer.enable_padding(
     pad_token="<pad>",
@@ -42,9 +42,9 @@ batch_size = 16
 
 texts = ["This is pure trash",]
 outputs = []
-model = InferenceSession("MiniLMv2-toxic-jijgsaw-onnx/model_optimized_quantized.onnx", providers=['CUDAExecutionProvider'])
+model = InferenceSession("MiniLMv2-toxic-jigsaw-onnx/model_optimized_quantized.onnx", providers=['CUDAExecutionProvider'])
 
-with open(os.path.join("MiniLMv2-toxic-jijgsaw-onnx", "config.json"), "r") as f:
+with open(os.path.join("MiniLMv2-toxic-jigsaw-onnx", "config.json"), "r") as f:
     config = json.load(f)
 
 output_names = [output.name for output in model.get_outputs()]
@@ -115,7 +115,7 @@ The following hyperparameters were used during training:
 
 | Teacher (params) | Student (params) | Set (metric) | Score (teacher) | Score (student) |
 |------------------|------------------|--------------|-----------------|-----------------|
-| unitary/toxic-bert (110M) | MiniLMv2-toxic-jijgsaw-onnx (23M) | Test (ROC_AUC) | 0.98636 | 0.98130 |
+| unitary/toxic-bert (110M) | MiniLMv2-toxic-jigsaw-onnx (23M) | Test (ROC_AUC) | 0.98636 | 0.98130 |
 
 # Deployment
 
 
 
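The diff above only shows the changed fragments of the README's usage snippet. For reference, below is a minimal end-to-end sketch of the same pipeline, not the repository's exact script: the ONNX input names (`input_ids`, `attention_mask`, `token_type_ids`), the `id2label` key in `config.json`, and the element-wise sigmoid over the logits (Jigsaw toxicity is multi-label) are assumptions inferred from the visible fragments, and `CPUExecutionProvider` is substituted for `CUDAExecutionProvider` so the sketch runs without a GPU.

```python
import json
import os

import numpy as np
from onnxruntime import InferenceSession
from tokenizers import Tokenizer

# Assumes the repo was cloned locally as shown in the README's bash block.
model_dir = "MiniLMv2-toxic-jigsaw-onnx"
model_name = "minuva/MiniLMv2-toxic-jigsaw-onnx"

tokenizer = Tokenizer.from_pretrained(model_name)
tokenizer.enable_padding(pad_token="<pad>")  # pad token per the README fragment

# CPUExecutionProvider keeps this runnable without CUDA.
model = InferenceSession(
    os.path.join(model_dir, "model_optimized_quantized.onnx"),
    providers=["CPUExecutionProvider"],
)

with open(os.path.join(model_dir, "config.json"), "r") as f:
    config = json.load(f)

output_names = [output.name for output in model.get_outputs()]
input_names = [inp.name for inp in model.get_inputs()]

texts = ["This is pure trash"]
encodings = tokenizer.encode_batch(texts)

# Build the feed dict from whatever inputs the exported graph declares;
# the three names handled here are assumptions typical of a BERT-family export.
feed = {}
for name in input_names:
    if name == "input_ids":
        feed[name] = np.array([e.ids for e in encodings], dtype=np.int64)
    elif name == "attention_mask":
        feed[name] = np.array([e.attention_mask for e in encodings], dtype=np.int64)
    elif name == "token_type_ids":
        feed[name] = np.array([e.type_ids for e in encodings], dtype=np.int64)

logits = model.run(output_names, feed)[0]

# Multi-label head (assumption): sigmoid per label, not softmax.
probs = 1.0 / (1.0 + np.exp(-logits))
id2label = config["id2label"]  # HF configs key id2label by string indices
for text, row in zip(texts, probs):
    scores = {id2label[str(i)]: float(p) for i, p in enumerate(row)}
    print(text, scores)
```

For larger inputs, the `batch_size = 16` fragment visible in the diff context suggests chunking `texts` before `encode_batch` and appending each chunk's probabilities to `outputs`; the same feed-building loop applies per chunk.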