amannagrawall002 committed on
Commit 978b38c · 1 Parent(s): 8e5e349

Updated Auto-generated model card.

Files changed (1)
  1. README.md +36 -88
README.md CHANGED
@@ -4,18 +4,32 @@ language: en
  tags:
  - bert-finetuned-mrpc
  - sequence-classification
- license: unknown
  ---

  # Bert-finetuned-mrpc Fine-tuned for Sequence classification

  This model is a fine-tuned version of [bert-finetuned-mrpc](https://huggingface.co/bert-finetuned-mrpc) for sequence classification tasks.

  ## Model description

  - Model architecture: BertForSequenceClassification
  - Task: sequence-classification
- - Training dataset: bert-finetuned-mrpc
  - Number of parameters: 109,483,778
  - Sequence length: 512
  - Vocab size: 30522
@@ -23,97 +37,31 @@ This model is a fine-tuned version of [bert-finetuned-mrpc](https://huggingface.
  - Number of attention heads: 12
  - Number of hidden layers: 12

- ## Intended uses & limitations
-
- This model is intended for sequence classification tasks. It has been fine-tuned on a specific dataset, so its performance may vary on different datasets or domains.

  ## Training procedure

- The model was fine-tuned using the following hyperparameters:
- {
-   "return_dict": true,
-   "output_hidden_states": false,
-   "output_attentions": false,
-   "torchscript": false,
-   "torch_dtype": "float32",
-   "use_bfloat16": false,
-   "tf_legacy_loss": false,
-   "pruned_heads": {},
-   "tie_word_embeddings": true,
-   "chunk_size_feed_forward": 0,
-   "is_encoder_decoder": false,
-   "is_decoder": false,
-   "cross_attention_hidden_size": null,
-   "add_cross_attention": false,
-   "tie_encoder_decoder": false,
-   "max_length": 20,
-   "min_length": 0,
-   "do_sample": false,
-   "early_stopping": false,
-   "num_beams": 1,
-   "num_beam_groups": 1,
-   "diversity_penalty": 0.0,
-   "temperature": 1.0,
-   "top_k": 50,
-   "top_p": 1.0,
-   "typical_p": 1.0,
-   "repetition_penalty": 1.0,
-   "length_penalty": 1.0,
-   "no_repeat_ngram_size": 0,
-   "encoder_no_repeat_ngram_size": 0,
-   "bad_words_ids": null,
-   "num_return_sequences": 1,
-   "output_scores": false,
-   "return_dict_in_generate": false,
-   "forced_bos_token_id": null,
-   "forced_eos_token_id": null,
-   "remove_invalid_values": false,
-   "exponential_decay_length_penalty": null,
-   "suppress_tokens": null,
-   "begin_suppress_tokens": null,
-   "architectures": [
-     "BertForSequenceClassification"
-   ],
-   "finetuning_task": null,
-   "id2label": {
-     "0": "LABEL_0",
-     "1": "LABEL_1"
-   },
-   "label2id": {
-     "LABEL_0": 0,
-     "LABEL_1": 1
-   },
-   "tokenizer_class": null,
-   "prefix": null,
-   "bos_token_id": null,
-   "pad_token_id": 0,
-   "eos_token_id": null,
-   "sep_token_id": null,
-   "decoder_start_token_id": null,
-   "task_specific_params": null,
-   "problem_type": "single_label_classification",
-   "_name_or_path": "bert-finetuned-mrpc",
-   "transformers_version": "4.38.1",
-   "gradient_checkpointing": false,
-   "model_type": "bert",
-   "vocab_size": 30522,
-   "hidden_size": 768,
-   "num_hidden_layers": 12,
-   "num_attention_heads": 12,
-   "hidden_act": "gelu",
-   "intermediate_size": 3072,
-   "hidden_dropout_prob": 0.1,
-   "attention_probs_dropout_prob": 0.1,
-   "max_position_embeddings": 512,
-   "type_vocab_size": 2,
-   "initializer_range": 0.02,
-   "layer_norm_eps": 1e-12,
-   "position_embedding_type": "absolute",
-   "use_cache": true,
-   "classifier_dropout": null
- }

  ## Evaluation results

- [Evaluation results to be added]

  tags:
  - bert-finetuned-mrpc
  - sequence-classification
+ license: apache-2.0
+ input: a pair of sentences
  ---

  # Bert-finetuned-mrpc Fine-tuned for Sequence classification

  This model is a fine-tuned version of [bert-finetuned-mrpc](https://huggingface.co/bert-finetuned-mrpc) for sequence classification tasks.

+ ## Dataset
+
+ - **Name**: MRPC (Microsoft Research Paraphrase Corpus)
+ - **Description**: The MRPC dataset consists of sentence pairs automatically extracted from online news sources, with human annotations indicating whether the two sentences in each pair are semantically equivalent.
+ - **Source**: The dataset is part of the GLUE benchmark (a loading sketch follows below).
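
A minimal sketch (for illustration; not part of the committed card) of loading MRPC with the 🤗 `datasets` library, assuming the standard GLUE configuration name:

```python
from datasets import load_dataset

# MRPC ships as a GLUE configuration; this pulls the train/validation/test splits.
raw_datasets = load_dataset("glue", "mrpc")

# Each example is a sentence pair plus a binary label
# (1 = paraphrase / equivalent, 0 = not equivalent).
print(raw_datasets["train"][0])
```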

  ## Model description

+ This model is a fine-tuned version of BERT-base-uncased, trained to determine whether two sentences are paraphrases of each other. It outputs label 1 if the sentences are semantically equivalent and label 0 if they are not (see the inference sketch below).
+
  - Model architecture: BertForSequenceClassification
  - Task: sequence-classification
+ - Training dataset: GLUE MRPC
  - Number of parameters: 109,483,778
  - Sequence length: 512
  - Vocab size: 30522
  - Number of attention heads: 12
  - Number of hidden layers: 12
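
A minimal inference sketch with 🤗 Transformers, for illustration only. The repo id below is an assumption based on the committer name; substitute the actual model repository:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical repo id -- replace with the actual model repository.
model_id = "amannagrawall002/bert-finetuned-mrpc"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# MRPC-style input: a pair of sentences to compare.
sentence1 = "The company said the deal was completed on Friday."
sentence2 = "The deal, the company said, closed on Friday."

inputs = tokenizer(sentence1, sentence2, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Label 1 = paraphrase / equivalent, label 0 = not equivalent.
print(logits.argmax(dim=-1).item())
```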

+ ## Intended Uses & Limitations
+
+ **Intended Uses**
+
+ - Paraphrase Detection: deciding whether two sentences are paraphrases of each other, which is useful in applications such as duplicate-question detection in forums, semantic search, and text summarization.
+ - Educational Purposes: demonstrating how a transformer model is fine-tuned on a specific task.
+
+ **Limitations**
+
+ - Dataset Bias: the MRPC sentence pairs are drawn from specific online news sources, which may introduce bias; the model may not perform as well on text from other domains.
+ - Context Limitations: the model evaluates sentence pairs in isolation, without broader context, which can lead to incorrect paraphrase decisions in complex settings.

  ## Training procedure

+ The model was fine-tuned with the following hyperparameters (a training sketch follows after the list):
+
+ - Optimizer: AdamW
+ - Learning Rate: 5e-5
+ - Epochs: 3
+ - Batch Size: 8
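
A minimal fine-tuning sketch with the 🤗 `Trainer` API using the hyperparameters above, for illustration only. The base checkpoint `bert-base-uncased`, the tokenization step, and the output directory are assumptions, not taken from the original training script; `Trainer`'s default optimizer is AdamW, matching the card:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Base checkpoint is an assumption; the card only names "BERT-base-uncased".
checkpoint = "bert-base-uncased"

raw_datasets = load_dataset("glue", "mrpc")
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

def tokenize(batch):
    # Encode each sentence pair jointly, as BERT expects for MRPC.
    return tokenizer(batch["sentence1"], batch["sentence2"], truncation=True)

tokenized = raw_datasets.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Hyperparameters taken from the list above.
args = TrainingArguments(
    output_dir="bert-finetuned-mrpc",
    learning_rate=5e-5,
    num_train_epochs=3,
    per_device_train_batch_size=8,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via DataCollatorWithPadding
)
trainer.train()
```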

  ## Evaluation results

+ - Accuracy: 0.8504901960784313
+ - F1: 0.8942807625649913
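
A sketch of how accuracy and F1 are typically computed for MRPC with the 🤗 `evaluate` library, for illustration only; the card does not state which split these numbers come from, so the validation split below is an assumption:

```python
import numpy as np
import evaluate

# The GLUE/MRPC metric bundle reports both accuracy and F1.
metric = evaluate.load("glue", "mrpc")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

# With the Trainer from the training sketch above:
# trainer = Trainer(..., compute_metrics=compute_metrics)
# trainer.evaluate(tokenized["validation"])
```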