---
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
- generated_from_trainer
model-index:
- name: long_t5_test
  results: []
---

# long_t5_test

This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 1.3506
- Rouge1: 0.4697
- Rouge2: 0.1989
- RougeL: 0.274
- RougeLsum: 0.2736
- Gen Len: 388.0152

## Model description

More information needed

## Intended uses & limitations

More information needed
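
Until full usage guidance is added, the sketch below shows how a checkpoint like this could be loaded for long-document summarization with the `transformers` library. The repository id `long_t5_test` and the generation settings are placeholders for illustration, not values confirmed by this card.

```python
# Minimal inference sketch. Assumptions: the fine-tuned checkpoint is saved or
# pushed as "long_t5_test"; the generation settings are illustrative only.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "long_t5_test"  # placeholder: replace with the actual checkpoint path
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

document = "..."  # a long input document to summarize
inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=4096)

# The average generation length above is ~388 tokens, so allow generous headroom.
summary_ids = model.generate(**inputs, max_new_tokens=512, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```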

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
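
A `Seq2SeqTrainingArguments` configuration matching the values above might look like the following. This is a reconstruction from the listed hyperparameters, not the exact training script; the output directory and evaluation settings are assumptions (the per-epoch evaluation cadence is inferred from the results table below).

```python
# Reconstructed from the hyperparameters above; output path and evaluation
# settings are assumptions. Adam with betas=(0.9, 0.999) and epsilon=1e-08
# is the Trainer default optimizer, so it needs no explicit argument here.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long_t5_test",      # placeholder output directory
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    evaluation_strategy="epoch",    # assumption: eval ran once per epoch
    predict_with_generate=True,     # needed to compute ROUGE on generated text
)
```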

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:--------:|
| No log        | 1.0   | 394  | 1.9389          | 0.0284       | 0.0089       | 0.0167       | 0.0165          | 30.2273  |
| 3.6937        | 2.0   | 788  | 1.4702          | 0.4261       | 0.1598       | 0.254        | 0.2539          | 399.0    |
| 1.8772        | 3.0   | 1182 | 1.4362          | 0.4397       | 0.1699       | 0.2592       | 0.2591          | 398.5152 |
| 1.7418        | 4.0   | 1576 | 1.4204          | 0.4434       | 0.1779       | 0.2627       | 0.2628          | 397.7374 |
| 1.7418        | 5.0   | 1970 | 1.4108          | 0.4474       | 0.181        | 0.2631       | 0.263           | 394.798  |
| 1.6623        | 6.0   | 2364 | 1.3932          | 0.4546       | 0.1873       | 0.2675       | 0.2673          | 391.8586 |
| 1.6449        | 7.0   | 2758 | 1.3872          | 0.4559       | 0.1882       | 0.2665       | 0.2664          | 393.4848 |
| 1.5757        | 8.0   | 3152 | 1.3814          | 0.458        | 0.1906       | 0.2692       | 0.2692          | 397.1061 |
| 1.5527        | 9.0   | 3546 | 1.3718          | 0.4607       | 0.1912       | 0.2705       | 0.2706          | 391.7222 |
| 1.5527        | 10.0  | 3940 | 1.3703          | 0.4649       | 0.194        | 0.2717       | 0.2719          | 393.8788 |
| 1.5302        | 11.0  | 4334 | 1.3621          | 0.4664       | 0.197        | 0.2726       | 0.2724          | 386.2071 |
| 1.5142        | 12.0  | 4728 | 1.3537          | 0.4694       | 0.1977       | 0.2731       | 0.2731          | 388.9798 |
| 1.4721        | 13.0  | 5122 | 1.3528          | 0.4652       | 0.1961       | 0.2716       | 0.2714          | 390.2828 |
| 1.4745        | 14.0  | 5516 | 1.3550          | 0.4708       | 0.2009       | 0.2742       | 0.2739          | 393.8131 |
| 1.4745        | 15.0  | 5910 | 1.3500          | 0.471        | 0.199        | 0.2742       | 0.2741          | 385.4192 |
| 1.4799        | 16.0  | 6304 | 1.3505          | 0.4725       | 0.2008       | 0.2764       | 0.2761          | 387.6364 |
| 1.4558        | 17.0  | 6698 | 1.3535          | 0.4743       | 0.2032       | 0.2765       | 0.2764          | 389.4192 |
| 1.4426        | 18.0  | 7092 | 1.3494          | 0.4743       | 0.2042       | 0.278        | 0.2776          | 386.4394 |
| 1.4426        | 19.0  | 7486 | 1.3513          | 0.4719       | 0.2019       | 0.2753       | 0.2752          | 388.6515 |
| 1.4411        | 20.0  | 7880 | 1.3506          | 0.4697       | 0.1989       | 0.274        | 0.2736          | 388.0152 |
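
The ROUGE columns above follow the key names of the `evaluate` library's `rouge` metric. A minimal sketch of how such scores can be computed, with placeholder prediction and reference lists:

```python
# Minimal ROUGE computation with the `evaluate` library; the predictions and
# references here are placeholders for decoded summaries and gold targets.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the generated summary ..."]
references = ["the reference summary ..."]

scores = rouge.compute(predictions=predictions, references=references)
# `scores` contains rouge1, rouge2, rougeL, and rougeLsum, matching the columns above.
print(scores)
```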


### Framework versions

- Transformers 4.37.2
- Pytorch 2.1.1+cu121
- Datasets 3.0.1
- Tokenizers 0.15.1