akshay9125 committed on
Commit 1e072cf · verified · 1 Parent(s): 3c131f4

Update README.md

Files changed (1): README.md +53 -18
README.md CHANGED
@@ -1,10 +1,10 @@
 ---
 library_name: transformers
- tags: []
 ---

 # Model Card for Model ID
-
 <!-- Provide a quick summary of what the model is/does. -->


@@ -14,22 +14,21 @@ tags: []
 ### Model Description

 <!-- Provide a longer summary of what this model is. -->

- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-
- - **Developed by:** [More Information Needed]
 - **Funded by [optional]:** [More Information Needed]
 - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
 - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]

 ### Model Sources [optional]

 <!-- Provide the basic links for the model. -->

- - **Repository:** [More Information Needed]
 - **Paper [optional]:** [More Information Needed]
 - **Demo [optional]:** [More Information Needed]

@@ -41,37 +40,68 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

- [More Information Needed]

 ### Downstream Use [optional]

 <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

- [More Information Needed]

 ### Out-of-Scope Use

 <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

- [More Information Needed]

 ## Bias, Risks, and Limitations

 <!-- This section is meant to convey both technical and sociotechnical limitations. -->

- [More Information Needed]

 ### Recommendations

 <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

 ## How to Get Started with the Model

 Use the code below to get started with the model.

- [More Information Needed]

 ## Training Details

@@ -79,11 +109,16 @@ Use the code below to get started with the model.

 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

- [More Information Needed]

 ### Training Procedure

 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

 #### Preprocessing [optional]

@@ -98,7 +133,7 @@ Use the code below to get started with the model.

 <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

- [More Information Needed]

 ## Evaluation

@@ -192,7 +227,7 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]

 ## Model Card Authors [optional]

- [More Information Needed]

 ## Model Card Contact
 
 ---
 library_name: transformers
+ tags: [text-summarization, pegasus, fine-tuned, NLP]
 ---

 # Model Card for Model ID
+ Model Card for Fine-Tuned Pegasus Summary Generator
 <!-- Provide a quick summary of what the model is/does. -->


 ### Model Description

 <!-- Provide a longer summary of what this model is. -->
+ This model is a fine-tuned version of the Pegasus model for text summarization, specifically optimized for generating structured summaries from transcripts. The model has been trained to capture key points, remove redundant information, and maintain coherence in summaries.

+ - **Developed by:** Akshay Choudhary
 - **Funded by [optional]:** [More Information Needed]
 - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** Transformer-based summarization model
+ - **Language(s) (NLP):** English
 - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** google/pegasus-large

 ### Model Sources [optional]

 <!-- Provide the basic links for the model. -->

+ - **Repository:** https://huggingface.co/akshay9125/Transcript_Summerizer/
 - **Paper [optional]:** [More Information Needed]
 - **Demo [optional]:** [More Information Needed]

 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

+ The model can be used directly for transcript summarization in various applications, including:
+
+ * Meeting and lecture transcript summarization
+ * Podcast and interview summarization
+ * Summarization of long-form text data

 ### Downstream Use [optional]

 <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

+ The model can be fine-tuned further for:
+
+ * Domain-specific summarization (e.g., medical, legal, educational transcripts)
+ * Integration into AI-powered note-taking tools

 ### Out-of-Scope Use

 <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

+ * Generating highly creative or fictional content
+ * Summarizing extremely noisy or low-quality transcripts
+ * Generating precise legal or medical documentation without expert verification

 ## Bias, Risks, and Limitations

 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+ The model may exhibit biases stemming from:
+
+ * The dataset used for fine-tuning
+ * The quality and clarity of input transcripts
+ * Potential loss of nuanced context in summarization

 ### Recommendations

 <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+ Users should:
+
+ * Validate summaries for critical use cases
+ * Avoid using the model for tasks requiring absolute accuracy without human verification
+ * Be aware of potential biases in summarization

 ## How to Get Started with the Model

 Use the code below to get started with the model.

+ ```python
+ from transformers import PegasusForConditionalGeneration, PegasusTokenizer
+
+ tokenizer = PegasusTokenizer.from_pretrained("akshay9125/Transcript_Summerizer")
+ model = PegasusForConditionalGeneration.from_pretrained("akshay9125/Transcript_Summerizer")
+
+ def summarize_text(text):
+     inputs = tokenizer(text, return_tensors="pt", truncation=True, padding="longest")
+     summary_ids = model.generate(**inputs)
+     return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
+ ```
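Note that `truncation=True` silently drops everything past Pegasus's maximum input length (1024 tokens for `google/pegasus-large`), so very long transcripts lose content. A minimal pre-chunking helper is sketched below; the word-based budget and the `chunk_transcript` name are illustrative assumptions, not part of the released model:

```python
def chunk_transcript(text, max_words=400):
    """Split a long transcript into word-budgeted chunks.

    A word count is only a rough proxy for the model's token count;
    400 words stays comfortably under the 1024-token input window.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]
```

Each chunk can then be passed through `summarize_text` and the partial summaries concatenated or re-summarized.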

 ## Training Details

 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

+ * Dataset: Collected and preprocessed transcript datasets
+ * Preprocessing: Removal of noise, speaker labels, and unnecessary pauses
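The noise and speaker-label removal described above can be sketched as a simple regex pass. This is a hypothetical illustration (the exact preprocessing code used for this fine-tune is not published), and the `clean_transcript` helper and its filler-word list are assumptions:

```python
import re

def clean_transcript(text):
    """Strip leading speaker labels (e.g. 'Alice:') and filler tokens from each line."""
    cleaned_lines = []
    for line in text.splitlines():
        # Drop a leading "Name:" speaker label, if present.
        line = re.sub(r"^\s*[A-Z][\w .'-]{0,30}:\s*", "", line)
        # Remove common filler words / pause markers.
        line = re.sub(r"\b(um+|uh+|er+)\b[,.]?\s*", "", line, flags=re.IGNORECASE)
        if line.strip():
            cleaned_lines.append(line.strip())
    return " ".join(cleaned_lines)
```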
 
 ### Training Procedure

 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

+ * Preprocessing: Tokenization with the Pegasus tokenizer
+ * Training regime: FP16 mixed precision
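The FP16 regime above corresponds to a `transformers` training configuration along the lines of the fragment below. This is a hypothetical config sketch: only `fp16=True` reflects what the card states, while the batch size, epochs, and learning rate are placeholder assumptions, not the published hyperparameters.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical settings; only fp16=True is taken from the card's stated training regime.
training_args = Seq2SeqTrainingArguments(
    output_dir="pegasus-transcript-summarizer",  # placeholder path
    fp16=True,                       # FP16 mixed-precision training, as stated above
    per_device_train_batch_size=2,   # assumption
    gradient_accumulation_steps=8,   # assumption
    num_train_epochs=3,              # assumption
    learning_rate=5e-5,              # assumption
)
```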
 
 #### Preprocessing [optional]


 <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

+ * Model size: ~568M parameters

 ## Evaluation


 ## Model Card Authors [optional]

+ **Akshay Choudhary**

 ## Model Card Contact