moshew committed on
Commit
64607b6
·
1 Parent(s): 42c6f09

Update README.md

Files changed (1)
  1. README.md +64 -2
README.md CHANGED
@@ -4,7 +4,10 @@ library_name: keras
4
 
5
  ## Model description
6
 
7
- More information needed
8
 
9
  ## Intended uses & limitations
10
 
@@ -12,7 +15,66 @@ More information needed
12
 
13
  ## Training and evaluation data
14
 
15
- More information needed
16
 
17
  ## Training procedure
18
 
 
4
 
5
  ## Model description
6
 
7
+ More than 100x smaller than DistilBERT, with an accuracy drop of less than 0.5 points on the SST-2 test data.
8
+
9
+ DistilBERT: 92.2% accuracy, 67M parameters
10
+ DistilBiLSTM: 91.7% accuracy, 658K parameters
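The size and accuracy comparison above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the rounded 67M figure for DistilBERT and takes the DistilBiLSTM parameter count (657,954) and accuracy (0.9176...) from the model summary and evaluation output further down in this README; the variable names are illustrative only.

```python
# Figures quoted in this README; DistilBERT's size is the rounded 67M value.
distilbert_params = 67_000_000
distilbilstm_params = 657_954  # "Total params" in the model summary

distilbert_acc = 92.2                         # reported SST-2 accuracy (%)
distilbilstm_acc = 0.9176276771004942 * 100   # accuracy printed by the evaluation code (%)

# How many times smaller the BiLSTM is, and how many accuracy points it loses
size_ratio = distilbert_params / distilbilstm_params
accuracy_drop = distilbert_acc - distilbilstm_acc

print(f"size ratio: {size_ratio:.1f}x")
print(f"accuracy drop: {accuracy_drop:.2f} points")
```

Under these assumptions the ratio comes out just above 100 and the drop just below 0.5 points, which matches the claim.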
11
 
12
  ## Intended uses & limitations
13
 
 
15
 
16
  ## Training and evaluation data
17
 
18
+ The following code evaluates the model on the SST-2 test set:
19
+ from datasets import load_dataset
20
+ import numpy as np
21
+ from sklearn.metrics import accuracy_score
22
+
23
+ from keras.preprocessing.text import Tokenizer
24
+ from keras.utils import pad_sequences
25
+ import tensorflow as tf
26
+ from huggingface_hub import from_pretrained_keras
27
+
28
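+ # Assumed values, inferred from the model summary: 500,000 embedding params / 50 dims = 10,000 words; input length 64
+ MAX_NUM_WORDS = 10000
+ MAX_LEN = 64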
29
+ sst2 = load_dataset("SetFit/sst2")
30
+ augmented_sst2_dataset = load_dataset("jmamou/augmented-glue-sst2")
31
+
32
+ pad_type = 'post'
33
+ trunc_type = 'post'
34
+
35
+ # Fit the tokenizer on the augmented training data
36
+ tokenizer = Tokenizer(num_words=MAX_NUM_WORDS)
37
+ tokenizer.fit_on_texts(augmented_sst2_dataset['train']['sentence'])
38
+
39
+ # Encode the test sentences into integer sequences
40
+ test_sequences = tokenizer.texts_to_sequences(sst2['test']['text'])
41
+
42
+ # Pad or truncate the test sequences to MAX_LEN
43
+ test_padded = pad_sequences(test_sequences, padding=pad_type, truncating=trunc_type, maxlen=MAX_LEN)
44
+
45
+ reloaded_model = from_pretrained_keras('moshew/distilbilstm-finetuned-sst-2-english')
46
+
47
+ pred = reloaded_model.predict(test_padded)
48
+ pred_bin = np.argmax(pred, axis=1)
49
+ accuracy_score(sst2['test']['label'], pred_bin)
50
+
51
+ 0.9176276771004942
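Two steps of the evaluation code above may be easier to follow in isolation: the `'post'` padding and truncation, and the argmax-based accuracy. The sketch below imitates both with plain NumPy on toy data. `pad_post` and `accuracy_from_scores` are hypothetical helpers written for illustration, not part of Keras or scikit-learn, and the toy inputs are made up.

```python
import numpy as np

def pad_post(sequences, maxlen, value=0):
    """Hypothetical helper: pad or truncate each sequence at its end."""
    out = np.full((len(sequences), maxlen), value, dtype=np.int32)
    for i, seq in enumerate(sequences):
        kept = seq[:maxlen]            # 'post' truncation drops tokens from the end
        out[i, :len(kept)] = kept      # 'post' padding leaves the fill value at the end
    return out

def accuracy_from_scores(scores, labels):
    """Fraction of rows whose highest-scoring class matches the label."""
    predicted = np.argmax(scores, axis=1)
    return float(np.mean(predicted == np.asarray(labels)))

# Toy token-id sequences: one shorter and one longer than maxlen
toy_sequences = [[5, 8, 2], [7, 1, 4, 9, 3, 6]]
padded = pad_post(toy_sequences, maxlen=4)
print(padded)

# Toy two-class scores for three examples, with their true labels
toy_scores = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
toy_accuracy = accuracy_from_scores(toy_scores, [0, 1, 1])
print(toy_accuracy)
```

The short sequence is padded with zeros on the right and the long one loses its trailing tokens, so every row ends up with the same length that the model's input layer expects.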
52
+
53
+ reloaded_model.summary()
54
+
55
+ Model: "model"
56
+ _________________________________________________________________
57
+  Layer (type)                Output Shape              Param #
58
+ =================================================================
59
+  input_1 (InputLayer)        [(None, 64)]              0
60
+
61
+  embedding (Embedding)       (None, 64, 50)            500000
62
+
63
+  bidirectional (Bidirectiona (None, 64, 128)           58880
64
+  l)
65
+
66
+  bidirectional_1 (Bidirectio (None, 128)               98816
67
+  nal)
68
+
69
+  dropout (Dropout)           (None, 128)               0
70
+
71
+  dense (Dense)               (None, 2)                 258
72
+
73
+ =================================================================
74
+ Total params: 657,954
75
+ Trainable params: 657,954
76
+ Non-trainable params: 0
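The per-layer counts in the summary can be reproduced by hand, which also shows where most of the parameters sit. The sketch below assumes a 10,000-word vocabulary, 50-dimensional embeddings, and 64 LSTM units per direction; these sizes are not stated in the README and are inferred from the output shapes and parameter counts above. The helper functions are illustrative, not Keras APIs.

```python
def embedding_params(vocab_size, dim):
    # One vector of length `dim` per vocabulary entry
    return vocab_size * dim

def bilstm_params(input_dim, units):
    # Each direction has 4 gates, each with input weights, recurrent weights, and a bias
    one_direction = 4 * (units * (input_dim + units) + units)
    return 2 * one_direction

def dense_params(input_dim, outputs):
    # Weight matrix plus one bias per output
    return input_dim * outputs + outputs

# Assumed sizes, inferred from the model summary rather than stated by the author
VOCAB_SIZE, EMBED_DIM, LSTM_UNITS, NUM_CLASSES = 10_000, 50, 64, 2

layer_params = {
    "embedding": embedding_params(VOCAB_SIZE, EMBED_DIM),
    "bidirectional": bilstm_params(EMBED_DIM, LSTM_UNITS),
    "bidirectional_1": bilstm_params(2 * LSTM_UNITS, LSTM_UNITS),
    "dense": dense_params(2 * LSTM_UNITS, NUM_CLASSES),
}
total_params = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name}: {count}")
print(f"total: {total_params}")
```

With these assumed sizes the computed values agree with the summary (500000, 58880, 98816, 258, total 657954), and the embedding table accounts for roughly three quarters of the model.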
77
+
78
 
79
  ## Training procedure
80