Update README.md
README.md
model-index:
- name: mt5-small-Context-Based-Chat-Summary-Plus
  results: []
datasets:
- prithivMLmods/Context-Based-Chat-Summary-Plus
language:
- en
pipeline_tag: summarization
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# mt5-small-Context-Based-Chat-Summary-Plus

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the [prithivMLmods/Context-Based-Chat-Summary-Plus](https://huggingface.co/datasets/prithivMLmods/Context-Based-Chat-Summary-Plus) dataset. It performs well on context-based summarization tasks, leveraging the mT5 model's multilingual capabilities.

## Model description

This model is designed for summarizing context-based chat data. It was trained to generate summaries from conversations and other text-based inputs, and uses a seq2seq architecture fine-tuned to produce accurate, coherent summaries.

## Intended uses & limitations

### Intended uses
- Contextual text summarization
- Summarizing chat logs, meeting transcripts, or conversational exchanges
- Extracting key points or highlights from a larger body of text

### Limitations
- May struggle with highly specialized or domain-specific language
- May produce summaries that require further refinement for nuanced or highly technical content

## Training and evaluation data

The model was trained on the [prithivMLmods/Context-Based-Chat-Summary-Plus](https://huggingface.co/datasets/prithivMLmods/Context-Based-Chat-Summary-Plus) dataset, which consists of conversational and text data paired with summaries that capture the key elements of the content.

### Data preprocessing
- Entries with short headlines (fewer than 3 words) or texts with fewer than 30 words were filtered out.
- The dataset was split into 90% training and 10% testing.

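The filtering and split described above can be sketched in plain Python. Note that the field names `headline` and `text` are assumptions for illustration; this card does not state the dataset's actual column names.

```python
def preprocess(records, min_headline_words=3, min_text_words=30, train_frac=0.9):
    """Filter out short entries, then split the remainder into train/test portions."""
    filtered = [
        r for r in records
        if len(r["headline"].split()) >= min_headline_words
        and len(r["text"].split()) >= min_text_words
    ]
    split_at = int(train_frac * len(filtered))
    return filtered[:split_at], filtered[split_at:]

# Toy stand-in data: 10 rows that pass the filters, 2 that should be dropped.
valid = [
    {"headline": f"sample headline number {i}", "text": " ".join(["w"] * 30)}
    for i in range(10)
]
invalid = [
    {"headline": "Hi", "text": " ".join(["w"] * 30)},            # headline under 3 words
    {"headline": "a valid headline here", "text": "too short"},  # text under 30 words
]
train, test = preprocess(valid + invalid)
print(len(train), len(test))  # 9 1
```

With the 🤗 `datasets` library, the same logic maps onto `Dataset.filter` followed by `Dataset.train_test_split(test_size=0.1)`.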
## Training procedure
### Training hyperparameters

- **Learning rate**: 5.6e-5
- **Train batch size**: 64
- **Eval batch size**: 64
- **Epochs**: 6 (an initial 4 epochs, followed by an additional 2)
- **Optimizer**: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
- **Scheduler**: linear learning-rate scheduler
- **Logging**: once per epoch

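In 🤗 Transformers terms, these settings correspond roughly to the `Seq2SeqTrainingArguments` below. This is a sketch, not the author's exact training script; the `output_dir` name is illustrative.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-Context-Based-Chat-Summary-Plus",  # illustrative name
    learning_rate=5.6e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    num_train_epochs=6,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    logging_strategy="epoch",
    eval_strategy="epoch",   # named `evaluation_strategy` in Transformers < 4.41
    predict_with_generate=True,  # generate full summaries during evaluation for ROUGE
)
```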
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 2.1912        | 5.0   | 6920 | 1.7372          | 51.912  | 28.3549 | 47.8763 | 47.8849   |
| 2.1537        | 6.0   | 8304 | 1.7287          | 52.033  | 28.5069 | 47.9951 | 47.994    |

### Framework versions
- **Transformers**: 4.47.1
- **Pytorch**: 2.5.1+cu121
- **Datasets**: 3.2.0
- **Tokenizers**: 0.21.0

## Evaluation

The model was evaluated using the ROUGE metric, achieving the following scores on the validation set:
- **Rouge-1**: 52.033
- **Rouge-2**: 28.5069
- **Rouge-L**: 47.9951
- **Rouge-Lsum**: 47.994
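For intuition, ROUGE-N measures n-gram overlap between a generated summary and a reference. The sketch below implements a minimal ROUGE-1 F-measure in plain Python; it is illustrative only, and the scores above would have been produced by a full ROUGE implementation such as the `rouge_score` package.

```python
from collections import Counter

def rouge1_f(prediction: str, reference: str) -> float:
    """ROUGE-1 F-measure: harmonic mean of unigram precision and recall."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

score = rouge1_f(
    "snehlata shrivastava appointed secretary general",
    "snehlata shrivastava named secretary general of lok sabha",
)
print(round(100 * score, 2))  # 61.54
```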

## Final results

After 6 epochs of training, the model was pushed to the Hugging Face Hub under the identifier **ParitKansal/mt5-small-Context-Based-Chat-Summary-Plus** and can be used for summarization tasks directly.

### Example usage

```python
from transformers import pipeline

hub_model_id = "ParitKansal/mt5-small-Context-Based-Chat-Summary-Plus"
summarizer = pipeline("summarization", model=hub_model_id)

text = (
    "Snehlata Shrivastava has been appointed as the Secretary General of the Lok Sabha, "
    "a notification issued by the Secretariat of the lower house said. She is the first "
    "woman to be elected for the post and will assume charge from December 1. She was "
    "earlier the Joint Secretary in the Law Ministry and has also worked in the Finance Ministry."
)
summary = summarizer(text)[0]["summary_text"]
print("Predicted Summary: ", summary)
```