Upload TFBertForQuestionAnswering

Files changed (3)

README.md +41 -112
config.json +2 -2
tf_model.h5 +3 -0

README.md CHANGED

@@ -1,117 +1,46 @@
-## ParsBert For Question Answering Task
-ParsBERT is a monolingual language model based on Google’s BERT architecture with the same configurations as BERT-Base.
-In this project I fine-tune ParsBert for extractive question answering task on PQuAD dataset.
-Paper presenting ParsBERT: [arXiv:2005.12515](https://arxiv.org/abs/2005.12515)
-Paper presenting PQuAD dataset: [arXiv:2202.06219](https://arxiv.org/abs/2202.06219)
----
-## Introduction
-This model is fine-tuned on PQuAD Train set and is easily ready to use.
-Its very long training time encouraged me to publish this model in order to make life easier for those who need.
-## Hyperparameters
-I set batch size to 32 due to the limitations of GPU memory in Google Colab.
-```
-batch_size = 32
-n_epochs = 2
-base_LM_model = 'HooshvareLab/bert-fa-base-uncased'
-max_seq_len = 256
-learning_rate = 5e-5
-```
-## Performance
-Evaluated on the PQuAD Persian test set with the [official PQuAD link](https://huggingface.co/datasets/newsha/PQuAD).
-The model started to get overfitted after 2 epochs with dropout rates between 0.01 to 0.1 and
-stopped to learn with rates bigger than 0.1.
-[Our XLM-Roberta](https://huggingface.co/pedramyazdipoor/persian_xlm_roberta_large) outperforms our ParsBert on PQuAD dataset, but the former is more than 3 times bigger than the latter one; so comparing these two is not fair.
-### Question Answering On Test Set of PQuAD Dataset
-|      Metric      | Our XLM-Roberta Large | Our ParsBert  |
-|:----------------:|:---------------------:|:-------------:|
-| Exact Match      |   66.56*              | 47.44         |
-|      F1          |   87.31*              | 81.96         |
-## How to use
-## Pytorch
-```python
-from transformers import AutoTokenizer, AutoModelForQuestionAnswering, AutoConfig
-tokenizer = AutoTokenizer.from_pretrained('pedramyazdipoor/parsbert_question_answering_PQuAD')
-model = AutoModelForQuestionAnswering.from_pretrained('pedramyazdipoor/parsbert_question_answering_PQuAD')
-config = AutoConfig.from_pretrained('pedramyazdipoor/parsbert_question_answering_PQuAD')
-```
-## Inference
-There are some considerations for inference:
-1) Start index of answer must be smaller than end index.
-2) The span of answer must be within the context.
-3) The selected span must be the most probable choice among N pairs of candidates.
-```python
-def generate_indexes(start_logits, end_logits, N, max_index):
-  output_start = start_logits
-  output_end = end_logits
-  start_indexes = np.arange(len(start_logits))
-  start_probs = output_start
-  list_start = dict(zip(start_indexes, start_probs.tolist()))
-  end_indexes = np.arange(len(end_logits))
-  end_probs = output_end
-  list_end = dict(zip(end_indexes, end_probs.tolist()))
-  sorted_start_list = sorted(list_start.items(), key=lambda x: x[1], reverse=True) #Descending sort by probability
-  sorted_end_list = sorted(list_end.items(), key=lambda x: x[1], reverse=True)
-  final_start_idx, final_end_idx = [[] for l in range(2)]
-  start_idx, end_idx, prob = 0, 0, (start_probs.tolist()[0] + end_probs.tolist()[0])
-  for a in range(0,N):
-    for b in range(0,N):
-      if (sorted_start_list[a][1] + sorted_end_list[b][1]) > prob :
-        if (sorted_start_list[a][0] <= sorted_end_list[b][0]) and (sorted_end_list[a][0] < max_index) :
-          prob = sorted_start_list[a][1] + sorted_end_list[b][1]
-          start_idx = sorted_start_list[a][0]
-          end_idx = sorted_end_list[b][0]
-  final_start_idx.append(start_idx)
-  final_end_idx.append(end_idx)
-  return final_start_idx[0], final_end_idx[0]
-```
-```python
-device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
-model.eval().to(device)
-text = 'اسمم پدرامه.'
-question = 'اسمم چیه؟'
-print(tokenizer.tokenize(text + question))
-encoding = tokenizer(text,question,add_special_tokens = True,
-                     return_token_type_ids = True,
-                     return_tensors = 'pt',
-                     padding = True,
-                     return_offsets_mapping = True,
-                     truncation = 'only_first',
-                     max_length = 32)
-out = model(encoding['input_ids'].to(device),encoding['attention_mask'].to(device), encoding['token_type_ids'].to(device))
-#we had to change some pieces of code to make it compatible with one answer generation at a time.
-#you can initialize max_index in generate_indexes() to put force on tokens being chosen to be within the context(end index must be less than seperator token).
-start_index, end_index = generate_indexes(out['start_logits'][0], out['end_logits'][0], 5, 0)
-print(tokenizer.tokenize(text + question)[start_index:end_index+1])
->>> ['اسمم', 'پدرام', '##ه', '.', 'اسمم', 'چیه', '؟']
->>> ['پدرام']
-```
-## Acknowledgments
-It would be never possible to train this model without the great job done by [HooshvareLab](https://huggingface.co/HooshvareLab/bert-base-parsbert-uncased).
-We also express our gratitude to the [Newsha Shahbodaghkhan](https://huggingface.co/datasets/newsha/PQuAD/tree/main) for facilitating dataset gathering.
-## Contributors
-- Pedram Yazdipoor : [Linkedin](https://www.linkedin.com/in/pedram-yazdipour/)
-## Releases
-### Release v0.1 (Sep 18, 2022)
-This is the First version of our ParsBert_For_Question_Answering_PQuAD.

+---
+tags:
+- generated_from_keras_callback
+model-index:
+- name: parsbert_question_answering_PQuAD
+  results: []
+---
+<!-- This model card has been generated automatically according to the information Keras had access to. You should
+probably proofread and complete it, then remove this comment. -->
+# parsbert_question_answering_PQuAD
+This model is a fine-tuned version of [pedramyazdipoor/parsbert_question_answering_PQuAD](https://huggingface.co/pedramyazdipoor/parsbert_question_answering_PQuAD) on an unknown dataset.
+It achieves the following results on the evaluation set:
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- optimizer: None
+- training_precision: float32
+### Training results
+### Framework versions
+- Transformers 4.22.1
+- TensorFlow 2.8.2
+- Tokenizers 0.12.1
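
The README removed in this commit listed three inference considerations: the start index must not exceed the end index, the span must lie within the context, and the chosen span must be the most probable among N candidate pairs. Since the regenerated card no longer carries that guidance, a minimal sketch of those three rules follows. It is not the author's original helper: the name `select_answer_span`, its parameters, and the `(0, 0)` fallback when no pair qualifies are assumptions made for illustration, and it works on plain Python sequences rather than model outputs.

```python
from typing import Optional, Sequence, Tuple


def select_answer_span(
    start_logits: Sequence[float],
    end_logits: Sequence[float],
    n_best: int = 5,
    max_index: Optional[int] = None,
) -> Tuple[int, int]:
    """Pick the (start, end) token pair with the highest summed logit.

    Only the n_best highest-scoring start and end positions are considered.
    A pair is valid when start <= end and end < max_index. Here max_index is
    assumed to be the position of the separator token that closes the context.
    """
    if max_index is None:
        # Assumption: with no bound given, any end position is allowed.
        max_index = len(end_logits)

    def top_indices(scores: Sequence[float]) -> list:
        # Positions sorted by descending score, truncated to n_best.
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return order[:n_best]

    best_span, best_score = (0, 0), float("-inf")
    for s in top_indices(start_logits):
        for e in top_indices(end_logits):
            if s <= e < max_index:
                score = start_logits[s] + end_logits[e]
                if score > best_score:
                    best_span, best_score = (s, e), score
    # Falls back to (0, 0) when no candidate pair satisfies the constraints.
    return best_span


# Toy logits, not real model output.
start = [0.1, 2.0, 0.3, 4.0, 0.2]
end = [0.0, 0.5, 3.0, 0.1, 2.5]
print(select_answer_span(start, end, n_best=3, max_index=5))  # (3, 4)
print(select_answer_span(start, end, n_best=3, max_index=4))  # (1, 2)
```

With real model output, the two logit vectors for one example would be converted to lists first, and `max_index` would be derived from the tokenized input; both steps are left out here because they depend on the framework in use.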

config.json CHANGED

@@ -1,7 +1,7 @@
 {
-  "_name_or_path": "HooshvareLab/bert-base-parsbert-uncased",
   "architectures": [
-    "QAModel2"
   ],
   "attention_probs_dropout_prob": 0.1,
   "classifier_dropout": null,

 {
+  "_name_or_path": "pedramyazdipoor/parsbert_question_answering_PQuAD",
   "architectures": [
+    "BertForQuestionAnswering"
   ],
   "attention_probs_dropout_prob": 0.1,
   "classifier_dropout": null,

tf_model.h5 ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ed169e2a4dfbc1b9ba785a83ee799bf7b443c10ce6a37cf59365650471f87697
+size 649278480