---
license: gpl-3.0
language:
- en
widget:
- text: 'Sentence: "Lee was making his final Canadian convention appearance, part of a larger farewell tour." The distribution of auxiliary verbs in the sentence is equal to '
  example_title: "Example 1"
---

# LiT5 Small

<p align="center">
    <img src="lit5.png" alt="Linguistically-Informed T5" width="500"/>
</p>


This model is released as part of the paper ["Linguistic Knowledge Can Enhance Encoder-Decoder Models (*If You Let It*)"](https://aclanthology.org/2024.lrec-main.922.pdf) (Miaschi et al., 2024). 
If you use this model in your work, we kindly ask you to cite our paper:

```bibtex
@inproceedings{miaschi-etal-2024-linguistic-knowledge,
    title = "Linguistic Knowledge Can Enhance Encoder-Decoder Models (If You Let It)",
    author = "Miaschi, Alessio  and
      Dell{'}Orletta, Felice  and
      Venturi, Giulia",
    editor = "Calzolari, Nicoletta  and
      Kan, Min-Yen  and
      Hoste, Veronique  and
      Lenci, Alessandro  and
      Sakti, Sakriani  and
      Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italy",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.922",
    pages = "10539--10554",
    abstract = "In this paper, we explore the impact of augmenting pre-trained Encoder-Decoder models, specifically T5, with linguistic knowledge for the prediction of a target task. In particular, we investigate whether fine-tuning a T5 model on an intermediate task that predicts structural linguistic properties of sentences modifies its performance in the target task of predicting sentence-level complexity. Our study encompasses diverse experiments conducted on Italian and English datasets, employing both monolingual and multilingual T5 models at various sizes. Results obtained for both languages and in cross-lingual configurations show that linguistically motivated intermediate fine-tuning has generally a positive impact on target task performance, especially when applied to smaller models and in scenarios with limited data availability.",
}
```

> **Abstract:** In this paper, we explore the impact of augmenting pre-trained Encoder-Decoder models, specifically T5, with linguistic knowledge for the prediction of a target task. In particular, we investigate whether fine-tuning a T5 model on an intermediate task that predicts structural linguistic properties of sentences modifies its performance in the target task of predicting sentence-level complexity. Our study encompasses diverse experiments conducted on Italian and English datasets, employing both monolingual and multilingual T5 models at various sizes. Results obtained for both languages and in cross-lingual configurations show that linguistically motivated intermediate fine-tuning has generally a positive impact on target task performance, especially when applied to smaller models and in scenarios with limited data availability.

Further information is available in the original [GitHub repository](https://github.com/alemiaschi/linguistically_informed_t5/tree/main).

## Model Description

The model is a T5 model fine-tuned in a multitask fashion on a set of intermediate tasks, each of which predicts a linguistic property of a sentence at the raw-text, morpho-syntactic, or syntactic level.
The full list of the 10 linguistic properties used as intermediate tasks can be found in the original paper.

It is fine-tuned from the English [t5-small](https://huggingface.co/google-t5/t5-small) checkpoint.
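
As a rough illustration of how the model might be queried, the sketch below builds a prompt in the same shape as the widget example at the top of this card and passes it to the standard `transformers` seq2seq classes. Several things here are assumptions rather than facts taken from the paper or the repository: the repository id `alemiaschi/lit5-small` (inferred from the naming of the sibling models), the helper names `build_prompt` and `predict_property`, the idea that the widget template generalizes to other property descriptions, and the `max_new_tokens` value.

```python
# Assumed repository id, inferred from the sibling models (e.g. alemiaschi/lit5-base).
MODEL_ID = "alemiaschi/lit5-small"


def build_prompt(sentence: str, prop: str = "distribution of auxiliary verbs") -> str:
    """Format a sentence like the widget example on this card.

    Only the auxiliary-verb wording appears on the card; other values of
    `prop` are hypothetical and may not match the training prompts.
    """
    return f'Sentence: "{sentence}" The {prop} in the sentence is equal to '


def predict_property(
    sentence: str,
    prop: str = "distribution of auxiliary verbs",
    model_id: str = MODEL_ID,
    max_new_tokens: int = 10,  # assumed budget; predicted values are short
) -> str:
    """Generate the model's textual prediction for one linguistic property."""
    # Imported lazily so that build_prompt can be used without transformers/torch.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(sentence, prop), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `predict_property("Lee was making his final Canadian convention appearance, part of a larger farewell tour.")` downloads the checkpoint on first use and returns the generated string; the exact output format is not documented on this card, so it is best inspected before being parsed.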

## Model variations

The other fine-tuned models presented in the original study are the following:

- [li-it5-small](https://huggingface.co/alemiaschi/li-it5-small)
- [li-it5-base](https://huggingface.co/alemiaschi/li-it5-base)
- [li-it5-large](https://huggingface.co/alemiaschi/li-it5-large)
- [lit5-base](https://huggingface.co/alemiaschi/lit5-base)
- [lit5-large](https://huggingface.co/alemiaschi/lit5-large)