---
license: apache-2.0
base_model: google/electra-small-discriminator
tags:
- generated_from_keras_callback
model-index:
- name: nguyennghia0902/electra-small-discriminator_0.0005_32
  results: []
language:
- vi
pipeline_tag: question-answering
---

# nguyennghia0902/electra-small-discriminator_0.0005_32

This model is a fine-tuned version of [google/electra-small-discriminator](https://huggingface.co/google/electra-small-discriminator) on a [Vietnamese question-answering dataset](https://www.kaggle.com/datasets/duyminhnguyentran/csc15105).
It achieves the following results at the final epoch:
- Train Loss: 0.9748
- Train End Logits Accuracy: 0.7441
- Train Start Logits Accuracy: 0.7181
- Validation Loss: 0.5570
- Validation End Logits Accuracy: 0.8476
- Validation Start Logits Accuracy: 0.8405
- Validation Matching Accuracy: 0.7642
- Epochs: 10
- Training time: 13,988.27 seconds (~3.89 hours)
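
"Matching accuracy" here is taken to mean the fraction of validation examples for which the predicted start **and** end positions are both correct. A minimal sketch of that metric under this assumption (not the original evaluation code):

```python
import numpy as np

def matching_accuracy(pred_starts, pred_ends, true_starts, true_ends):
    """Fraction of examples whose predicted start AND end indices are both correct.

    Assumed definition of the "Validation Matching Accuracy" reported above;
    the original evaluation script may differ.
    """
    pred_starts = np.asarray(pred_starts)
    pred_ends = np.asarray(pred_ends)
    true_starts = np.asarray(true_starts)
    true_ends = np.asarray(true_ends)
    return float(np.mean((pred_starts == true_starts) & (pred_ends == true_ends)))
```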


## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- Learning rate: 5e-4, decayed linearly to 0.0 over 15,630 steps (`PolynomialDecay`, power=1.0, no cycling)
- Batch size: 32
- Optimizer: Adam (beta_1=0.9, beta_2=0.999, epsilon=1e-08, no AMSGrad, no weight decay, no gradient clipping, `jit_compile=True`)
- Training precision: float32
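
For reference, the optimizer and learning-rate schedule above can be reconstructed in Keras roughly as follows (a sketch based on the serialized configuration, not the original training script):

```python
import tensorflow as tf

# Linear decay of the learning rate from 5e-4 to 0 over the 15,630 training steps.
lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=5e-4,
    decay_steps=15630,
    end_learning_rate=0.0,
    power=1.0,
    cycle=False,
)

# Adam with the betas/epsilon listed above; jit_compile=True enables XLA compilation.
optimizer = tf.keras.optimizers.Adam(
    learning_rate=lr_schedule,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-08,
    jit_compile=True,
)
```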

### Training results

| Train Loss | Train End Logits Accuracy | Train Start Logits Accuracy | Validation Loss | Validation End Logits Accuracy | Validation Start Logits Accuracy | Epoch |
|:----------:|:-------------------------:|:---------------------------:|:---------------:|:------------------------------:|:--------------------------------:|:-----:|
| 3.4201     | 0.2553                    | 0.2310                      | 2.6430          | 0.3942                         | 0.3704                           | 1     |
| 2.7588     | 0.3762                    | 0.3462                      | 2.2758          | 0.4660                         | 0.4482                           | 2     |
| 2.4695     | 0.4323                    | 0.3983                      | 2.0056          | 0.5211                         | 0.5006                           | 3     |
| 2.2478     | 0.4745                    | 0.4407                      | 1.7412          | 0.5763                         | 0.5595                           | 4     |
| 2.0321     | 0.5186                    | 0.4864                      | 1.5126          | 0.6289                         | 0.6095                           | 5     |
| 1.8186     | 0.5614                    | 0.5319                      | 1.2839          | 0.6719                         | 0.6647                           | 6     |
| 1.6012     | 0.6060                    | 0.5760                      | 1.0431          | 0.7322                         | 0.7264                           | 7     |
| 1.3677     | 0.6561                    | 0.6257                      | 0.8193          | 0.7857                         | 0.7770                           | 8     |
| 1.1450     | 0.7023                    | 0.6765                      | 0.6373          | 0.8275                         | 0.8215                           | 9     |
| 0.9748     | 0.7441                    | 0.7181                      | 0.5570          | 0.8476                         | 0.8405                           | 10    |


### Framework versions

- Transformers 4.39.3
- TensorFlow 2.15.0
- Datasets 2.18.0
- Tokenizers 0.15.2

## How to use
```python
import tensorflow as tf
from transformers import ElectraTokenizerFast, TFElectraForQuestionAnswering

model_hf = "nguyennghia0902/electra-small-discriminator_0.0005_32"
tokenizer = ElectraTokenizerFast.from_pretrained(model_hf)
reload_model = TFElectraForQuestionAnswering.from_pretrained(model_hf)

question = "Ký túc xá Đại học Quốc gia Thành phố Hồ Chí Minh bao gồm có bao nhiêu khu?"
context = "Ký túc xá Đại học Quốc gia Thành phố Hồ Chí Minh (Ký túc xá ĐHQG-TPHCM) là hệ thống ký túc xá xây tại Khu đô thị Đại học Quốc gia Thành phố Hồ Chí Minh (còn gọi với tên phổ biến: Khu đô thị ĐHQG-HCM hay Làng Đại học Thủ Đức). Ký túc xá ĐHQG-TPHCM gồm có 02 khu: A và B. Địa chỉ: Đường Tạ Quang Bửu, Khu phố 6, phường Linh Trung, thành phố Thủ Đức, Thành phố Hồ Chí Minh, điện thoại: 1900 05 55 59 (111). "

# Tokenize the question/context pair; keep the offset mapping so token-level
# predictions can be mapped back onto the original context string.
inputs = tokenizer(
    question,
    context,
    return_offsets_mapping=True,
    return_tensors="tf",
    max_length=512,
    truncation=True,
)
offset_mapping = inputs.pop("offset_mapping")

# Run the model and pick the most likely start and end token positions.
outputs = reload_model(**inputs)
answer_start_index = int(tf.math.argmax(outputs.start_logits, axis=-1)[0])
answer_end_index = int(tf.math.argmax(outputs.end_logits, axis=-1)[0])

# Convert the token span to character offsets and slice the answer out of the context.
start_char = int(offset_mapping[0][answer_start_index][0])
end_char = int(offset_mapping[0][answer_end_index][1])
predicted_answer_text = context[start_char:end_char]

print(predicted_answer_text)
```
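
Alternatively, the same checkpoint can be used through the Transformers `pipeline` API (a shorter sketch, assuming the TensorFlow weights are loaded with `framework="tf"`):

```python
from transformers import pipeline

# Question-answering pipeline around the same checkpoint; framework="tf" selects
# the TensorFlow weights.
qa = pipeline(
    "question-answering",
    model="nguyennghia0902/electra-small-discriminator_0.0005_32",
    framework="tf",
)

result = qa(
    question="Ký túc xá Đại học Quốc gia Thành phố Hồ Chí Minh bao gồm có bao nhiêu khu?",
    context="Ký túc xá ĐHQG-TPHCM gồm có 02 khu: A và B.",
)
print(result["answer"], result["score"])
```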