manjunathainti committed on
Commit a9558cd · verified · 1 Parent(s): dc278e4

Training details push

Files changed (1): README.md +67 -0

README.md CHANGED
@@ -62,6 +62,73 @@ The model may reflect biases present in the training data, such as jurisdictional
 - Outputs should always be reviewed by a legal expert.
 - Avoid using for legal tasks where complete precision is mandatory.
 
+
+ ### Training Data
+ - **Dataset:** Multi-LexSum
+ - **Preprocessing:** Tokenized and truncated for summarization tasks.
+
+ ### Training Procedure
+
+ #### Preprocessing
+ - Tokenization and truncation were applied to the dataset, as sketched below.
+ - Input sequences were capped at 1024 tokens.
+ - Summaries were limited to:
+   - 150 tokens for short summaries.
+   - 300 tokens for long summaries.
+
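+ A minimal preprocessing sketch under the settings above. The hub ID, config name, and column names are assumptions, not confirmed by this card; adjust them to the Multi-LexSum copy you use.
+
+ ```python
+ from datasets import load_dataset
+ from transformers import AutoTokenizer
+
+ # Hub ID and config name are assumptions; Multi-LexSum stores source documents as a list.
+ dataset = load_dataset("allenai/multi_lexsum", name="v20220616")
+ tokenizer = AutoTokenizer.from_pretrained("t5-base")
+
+ def preprocess(example):
+     # T5 is text-to-text; a task prefix such as "summarize: " is the usual convention.
+     model_inputs = tokenizer(
+         "summarize: " + " ".join(example["sources"]),
+         max_length=1024,  # inputs capped at 1024 tokens
+         truncation=True,
+     )
+     # 300 tokens for long summaries; use the short-summary column with 150 for short ones.
+     labels = tokenizer(
+         example["summary/long"],
+         max_length=300,
+         truncation=True,
+     )
+     model_inputs["labels"] = labels["input_ids"]
+     return model_inputs
+
+ tokenized = dataset["train"].map(preprocess)
+ ```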
+ #### Training Hyperparameters
+ - **Learning Rate:** 5e-5
+ - **Batch Size:** 1 (gradient accumulation steps: 8)
+ - **Epochs:** 3
+ - **Optimizer:** AdamW
+ - **Precision:** Mixed (fp16); see the configuration sketch below
+
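+ A sketch of these settings as Hugging Face `Seq2SeqTrainingArguments`; the output path is a placeholder and the optimizer flag assumes the PyTorch AdamW implementation:
+
+ ```python
+ from transformers import Seq2SeqTrainingArguments
+
+ training_args = Seq2SeqTrainingArguments(
+     output_dir="./t5-multilexsum",    # placeholder path
+     learning_rate=5e-5,
+     per_device_train_batch_size=1,
+     gradient_accumulation_steps=8,    # effective batch size of 8
+     num_train_epochs=3,
+     optim="adamw_torch",              # AdamW optimizer
+     fp16=True,                        # mixed precision
+ )
+ ```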
+ #### Speeds, Sizes, Times
+ - **Training Time:** ~4 hours
+ - **Checkpoint Size:** ~892 MB
+ - **Hardware:** NVIDIA Tesla V100
+
+ ## Evaluation
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+ - Validation was performed on the `validation` split of the Multi-LexSum dataset, consisting of 4,818 examples.
+
+ #### Metrics
+ - **ROUGE-1:** 0.49
+ - **ROUGE-2:** 0.35
+ - **ROUGE-L:** 0.49 (computed as sketched below)
+
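+ A minimal sketch of computing such scores with the `evaluate` library (the toy strings are illustrative; the card does not state the exact evaluation script):
+
+ ```python
+ import evaluate
+
+ rouge = evaluate.load("rouge")
+
+ # In practice these are model outputs and reference summaries from the validation split.
+ predictions = ["The court granted the motion for class certification."]
+ references = ["The court granted the plaintiffs' motion for class certification."]
+
+ scores = rouge.compute(predictions=predictions, references=references)
+ print(scores["rouge1"], scores["rouge2"], scores["rougeL"])
+ ```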
+ ### Results
+ - The model produces reliable short and long summaries for legal documents, maintaining coherence and relevance.
+
+ #### Summary
+ - The fine-tuned T5 model demonstrated robust performance in summarizing legal documents, achieving competitive ROUGE scores.
+
+ ## Model Examination
+
+ ### Interpretability
+ - The model produces human-readable summaries, so end users in the legal domain can inspect and verify its outputs directly.
+
+ ## Environmental Impact
+ - **Carbon emissions** can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+ - **Hardware Type:** NVIDIA Tesla V100
+ - **Hours Used:** ~4
+ - **Cloud Provider:** Google Colab
+ - **Compute Region:** US
+ - **Estimated Carbon Emissions:** Low, given the ~4-hour single-GPU run; a rough estimate follows.
+
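+ A back-of-the-envelope estimate following the power × time × carbon-intensity approach of Lacoste et al.; the PUE and grid carbon intensity are assumptions:
+
+ ```python
+ # All constants besides the 4-hour runtime are assumptions.
+ gpu_power_kw = 0.30        # Tesla V100 TDP is ~300 W
+ hours = 4.0                # reported training time
+ pue = 1.1                  # assumed datacenter power usage effectiveness
+ kg_co2_per_kwh = 0.4       # assumed US grid carbon intensity
+
+ energy_kwh = gpu_power_kw * hours * pue
+ print(f"~{energy_kwh * kg_co2_per_kwh:.2f} kg CO2eq")  # ≈ 0.53 kg CO2eq
+ ```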
+ ## Technical Specifications
+
+ ### Model Architecture and Objective
+ - The T5 architecture is designed for text-to-text tasks: every task is framed as mapping an input string to an output string.
+ - This fine-tuned model adapts T5 to legal text summarization, leveraging the flexibility of seq2seq learning (illustrated below).
+
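+ A short illustration of the text-to-text framing (the `summarize:` prefix is the common T5 convention; the checkpoint name is a placeholder for this repository's model ID):
+
+ ```python
+ from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("t5-base")      # placeholder checkpoint
+ model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")  # placeholder checkpoint
+
+ text = "summarize: The plaintiff filed a class action alleging unsafe jail conditions ..."
+ inputs = tokenizer(text, return_tensors="pt", max_length=1024, truncation=True)
+ ids = model.generate(**inputs, max_new_tokens=150)  # 150-token cap for short summaries
+ print(tokenizer.decode(ids[0], skip_special_tokens=True))
+ ```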
+ ### Compute Infrastructure
+ - **Hardware:** NVIDIA Tesla V100
+ - **Software:** Hugging Face Transformers 4.46.3, PyTorch
+
 ## How to Get Started with the Model
 
 ```python