ParitKansal committed (verified)
Commit 8ddae19 · 1 Parent(s): 7bb9b18

Update README.md

Files changed (1):
1. README.md +59 -24

README.md CHANGED
@@ -12,45 +12,57 @@ metrics:
 model-index:
 - name: mt5-small-Context-Based-Chat-Summary-Plus
   results: []
+datasets:
+- prithivMLmods/Context-Based-Chat-Summary-Plus
+language:
+- en
+pipeline_tag: summarization
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
+Here’s a modified version of your original model description to reflect the code you provided:
+
+---
+
 # mt5-small-Context-Based-Chat-Summary-Plus
 
-This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 1.7287
-- Rouge1: 52.033
-- Rouge2: 28.5069
-- Rougel: 47.9951
-- Rougelsum: 47.994
+This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the [prithivMLmods/Context-Based-Chat-Summary-Plus](https://huggingface.co/datasets/prithivMLmods/Context-Based-Chat-Summary-Plus) dataset. It performs well on context-based summarization tasks, leveraging the mT5 model's multilingual capabilities.
 
 ## Model description
 
-More information needed
+This model is designed for summarizing context-based chat data. The model was trained to generate summaries based on conversations and text-based inputs. It uses a seq2seq architecture, fine-tuned to produce accurate and coherent summaries.
 
 ## Intended uses & limitations
 
-More information needed
+### Intended Uses:
+- Contextual text summarization
+- Summarizing chat logs, meeting transcripts, or conversational exchanges
+- Extracting key points or highlights from a larger body of text
+
+### Limitations:
+- May struggle with highly specialized or domain-specific language
+- Could produce summaries that may require further refinement for nuanced or highly technical content
 
 ## Training and evaluation data
 
-More information needed
+The model was trained on the [prithivMLmods/Context-Based-Chat-Summary-Plus](https://huggingface.co/datasets/prithivMLmods/Context-Based-Chat-Summary-Plus) dataset, which consists of conversational and text data, with summaries representing the key elements of the content.
+
+### Data preprocessing:
+- Filters were applied to exclude entries with short headlines (less than 3 words) or text with fewer than 30 words.
+- The dataset was split into 90% training and 10% testing.
 
 ## Training procedure
 
 ### Training hyperparameters
-
-The following hyperparameters were used during training:
-- learning_rate: 5.6e-05
-- train_batch_size: 64
-- eval_batch_size: 64
-- seed: 42
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
-- num_epochs: 6
+- **Learning Rate**: 5.6e-5
+- **Train Batch Size**: 64
+- **Eval Batch Size**: 64
+- **Epochs**: 6 (initially 4 epochs, followed by an additional 2 epochs)
+- **Optimizer**: AdamW with betas=(0.9,0.999) and epsilon=1e-08
+- **Scheduler**: Linear learning rate scheduler
+- **Logging**: Logging steps were set to show every epoch.
 
 ### Training results
 
@@ -62,10 +74,33 @@ The following hyperparameters were used during training:
 | 2.1912 | 5.0 | 6920 | 1.7372 | 51.912 | 28.3549 | 47.8763 | 47.8849 |
 | 2.1537 | 6.0 | 8304 | 1.7287 | 52.033 | 28.5069 | 47.9951 | 47.994 |
 
-
-### Framework versions
-
-- Transformers 4.47.1
-- Pytorch 2.5.1+cu121
-- Datasets 3.2.0
-- Tokenizers 0.21.0
+### Framework versions:
+- **Transformers**: 4.47.1
+- **Pytorch**: 2.5.1+cu121
+- **Datasets**: 3.2.0
+- **Tokenizers**: 0.21.0
+
+## Evaluation
+
+The model was evaluated using the ROUGE metric, achieving the following scores on the validation set:
+- **Rouge-1**: 52.033
+- **Rouge-2**: 28.5069
+- **Rouge-L**: 47.9951
+- **Rouge-Lsum**: 47.994
+
+## Final Results
+After 6 epochs of training, the model was pushed to the Hugging Face Hub with the identifier **ParitKansal/mt5-small-Context-Based-Chat-Summary-Plus**. You can use it for summarization tasks directly.
+
+### Example Usage:
+
+```python
+from transformers import pipeline
+
+hub_model_id = "ParitKansal/mt5-small-Context-Based-Chat-Summary-Plus"
+summarizer = pipeline("summarization", model=hub_model_id)
+text = "Snehlata Shrivastava has been appointed as the Secretary General of the Lok Sabha, a notification issued by the Secretariat of the lower house said. She is the first woman to be elected for the post and will assume charge from December 1. She was earlier the Joint Secretary in the Law Ministry and has also worked in the Finance Ministry."
+summary = summarizer(text)[0]['summary_text']
+print("Predicted Summary: ", summary)
+```
+
+---
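
The preprocessing the updated card describes (dropping entries whose headline has fewer than 3 words or whose text has fewer than 30 words, then a 90/10 train/test split) can be sketched as below. This is an illustrative sketch, not the author's training script: the column names `headline` and `text` are assumptions about the dataset's schema, and the actual split may have used different arguments.

```python
# Sketch of the filtering step described in the model card.
# Assumed column names: "headline" (summary) and "text" (input document).

def keep(example):
    """Keep entries with a headline of at least 3 words
    and a text of at least 30 words."""
    return (len(example["headline"].split()) >= 3
            and len(example["text"].split()) >= 30)

# With the Hugging Face `datasets` library, this predicate would be
# applied roughly as:
#   ds = load_dataset("prithivMLmods/Context-Based-Chat-Summary-Plus",
#                     split="train")
#   ds = ds.filter(keep).train_test_split(test_size=0.1, seed=42)

# Quick demonstration on toy rows:
short = {"headline": "Too short", "text": "word " * 40}
ok = {"headline": "Secretary General appointed to Lok Sabha",
      "text": "word " * 40}
print(keep(short), keep(ok))  # False True
```

`Dataset.filter` keeps rows for which the predicate returns `True`, and `train_test_split(test_size=0.1)` yields the 90/10 split mentioned in the card.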