elisanchez-beep committed on
Commit
eebf9a7
1 Parent(s): a4b98bb

Update README.md

Files changed (1)
  1. README.md +81 -0
README.md CHANGED
@@ -1,3 +1,84 @@
1
  ---
2
  license: apache-2.0
3
  ---
1
  ---
2
  license: apache-2.0
3
+ language:
4
+ - en
5
+ metrics:
6
+ - f1
7
+ pipeline_tag: token-classification
8
  ---
9
+
10
+ # XLM-RoBERTa for English Metaphor Detection
11
+
12
+ This model is a fine-tuned version of XLM-RoBERTa-large on the VUAM dataset for token-level metaphor detection in English. It is presented in our paper [Leveraging a New Spanish Corpus for Multilingual and Cross-lingual Metaphor Detection](https://aclanthology.org/2022.conll-1.16/).
13
+
14
+
15
+ ### Model Sources
16
+
17
+ <!-- Provide the basic links for the model. -->
18
+
19
+ - **Repository:** https://github.com/ixa-ehu/cometa
20
+ - **Paper:** [Leveraging a New Spanish Corpus for Multilingual and Cross-lingual Metaphor Detection](https://aclanthology.org/2022.conll-1.16/)
21
+
22
+
23
+ ### Training & Testing Data
24
+ The model was trained and tested on the VUAM dataset (Steen et al., 2010).
25
+
26
+ #### Training Hyperparameters
27
+
28
+ - Batch size: 32
29
+ - Weight decay: 0.1
30
+ - Learning rate: 0.00003 (3e-5)
31
+ - Epochs: 4
32
+
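As a rough sketch, the hyperparameters above could be collected in a Python dictionary like the one below. The key names are an assumption: they follow the argument names of Hugging Face `TrainingArguments`. The card does not say whether the batch size of 32 is per device or global, nor which training script was used.

```python
# Hyperparameters listed in this card, gathered in one place.
# Assumption: key names mirror Hugging Face `TrainingArguments` arguments;
# the card does not state whether 32 is a per-device or a global batch size.
training_config = {
    "per_device_train_batch_size": 32,
    "weight_decay": 0.1,
    "learning_rate": 3e-5,  # written as 0.00003 in the card
    "num_train_epochs": 4,
}

for name, value in training_config.items():
    print(f"{name}: {value}")
```

Keeping the values in a single dictionary makes it easy to log them or to unpack them into a trainer configuration.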
33
+ ### Results
34
+
35
+ - F1: 72.11
36
+ - Precision: 73.95
37
+ - Recall: 70.37
38
+
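F1 is the harmonic mean of precision and recall, so the three figures above can be cross-checked with a short calculation. This is only a sanity-check sketch: the function name is illustrative, and a small discrepancy is expected because the reported precision and recall are themselves rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


reported_f1 = 72.11
recomputed_f1 = f1_score(73.95, 70.37)

# The two values should agree up to rounding of the inputs.
print(f"reported={reported_f1:.2f} recomputed={recomputed_f1:.2f}")
```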
39
+
40
+ ## Label Dictionary
41
+
42
+ ```
43
+ {
44
+ "LABEL_0": "B-METAPHOR",
45
+ "LABEL_1": "I-METAPHOR",
46
+ "LABEL_2": "O"
47
+ }
48
+ ```
49
+
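A minimal sketch of how this mapping might be applied to predictions. It assumes the predictions are a list of dictionaries with an `entity` key holding the raw label, which is the shape the `transformers` token-classification pipeline returns when no aggregation is used. The helper name and the sample predictions are made up for illustration and are not real model output.

```python
LABEL_MAP = {
    "LABEL_0": "B-METAPHOR",
    "LABEL_1": "I-METAPHOR",
    "LABEL_2": "O",
}


def rename_labels(predictions):
    """Return copies of the predictions with readable label names."""
    return [
        {**pred, "entity": LABEL_MAP.get(pred["entity"], pred["entity"])}
        for pred in predictions
    ]


# Hand-written sample in the assumed format (not real model output).
sample = [
    {"word": "Time", "entity": "LABEL_2"},
    {"word": "flies", "entity": "LABEL_0"},
]

for pred in rename_labels(sample):
    print(pred["word"], pred["entity"])
```

Labels that are missing from the dictionary are passed through unchanged, so the helper does not fail on unexpected input.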
50
+ ## Citation
51
+
52
+ If you use this model, please cite our work and the VUAM dataset:
53
+
54
+ ```
55
+
56
+ @book{steen2010method,
57
+ title={A method for linguistic metaphor identification: From MIP to MIPVU},
58
+ author={Steen, Gerard and Dorst, Lettie and Herrmann, J. and Kaal, Anna and Krennmayr, Tina and Pasma, Trijntje},
59
+ volume={14},
60
+ year={2010},
61
+ publisher={John Benjamins Publishing}
62
+ }
63
+
64
+ @inproceedings{sanchez-bayona-agerri-2022-leveraging,
65
+ title = "Leveraging a New {S}panish Corpus for Multilingual and Cross-lingual Metaphor Detection",
66
+ author = "Sanchez-Bayona, Elisa and
67
+ Agerri, Rodrigo",
68
+ editor = "Fokkens, Antske and
69
+ Srikumar, Vivek",
70
+ booktitle = "Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL)",
71
+ month = dec,
72
+ year = "2022",
73
+ address = "Abu Dhabi, United Arab Emirates (Hybrid)",
74
+ publisher = "Association for Computational Linguistics",
75
+ url = "https://aclanthology.org/2022.conll-1.16",
76
+ doi = "10.18653/v1/2022.conll-1.16",
77
+ pages = "228--240",
78
+ abstract = "The lack of wide coverage datasets annotated with everyday metaphorical expressions for languages other than English is striking. This means that most research on supervised metaphor detection has been published only for that language. In order to address this issue, this work presents the first corpus annotated with naturally occurring metaphors in Spanish large enough to develop systems to perform metaphor detection. The presented dataset, CoMeta, includes texts from various domains, namely, news, political discourse, Wikipedia and reviews. In order to label CoMeta, we apply the MIPVU method, the guidelines most commonly used to systematically annotate metaphor on real data. We use our newly created dataset to provide competitive baselines by fine-tuning several multilingual and monolingual state-of-the-art large language models. Furthermore, by leveraging the existing VUAM English data in addition to CoMeta, we present the, to the best of our knowledge, first cross-lingual experiments on supervised metaphor detection. Finally, we perform a detailed error analysis that explores the seemingly high transfer of everyday metaphor across these two languages and datasets.",
79
+ }
80
+ ```
81
+
82
+ ## Model Card Contact
83
+
84
+ {elisa.sanchez, rodrigo.agerri}@ehu.eus