iproskurina committed on
Commit
aa9aaa8
·
1 Parent(s): c185800

Create README.md

Files changed (1)
  1. README.md +61 -0
README.md ADDED
@@ -0,0 +1,61 @@
+ ---
+ license: apache-2.0
+ tags:
+ - TDA
+ metrics:
+ - accuracy
+ - matthews_correlation
+ model-index:
+ - name: ruRoberta-large-ru-cola_32_1e-05_lr_0.0001_decay_balanced_freeze
+   results: []
+ datasets:
+ - RussianNLP/rucola
+ language:
+ - ru
+ ---
+
+ [**Official repository**](https://github.com/upunaprosk/la-tda)
+
+ # RuRoBERTa-large-TDA
+
+ This model is a version of [sberbank-ai/ruRoberta-large](https://huggingface.co/sberbank-ai/ruRoberta-large) fine-tuned on [RuCoLA](https://huggingface.co/datasets/RussianNLP/rucola).
+ It achieves the following results on the evaluation set:
+ - Accuracy: 0.835
+ - MCC (Matthews correlation coefficient): 0.530
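
For readers unfamiliar with the two metrics above, the following is a minimal sketch of how accuracy and MCC are computed from binary confusion counts. The function names and the example counts are illustrative assumptions; they are not taken from the repository and do not reproduce the reported scores.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for binary classification.

    Returns 0.0 when the denominator vanishes (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical confusion counts, for illustration only.
tp, tn, fp, fn = 70, 15, 10, 5
print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it is less flattering when the classes are imbalanced.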
+
+ ## Features extracted from the Transformer
+
+ The features extracted from attention maps include the following:
+
+ 1. **Topological features** are properties of attention graphs. Features of directed attention graphs include the number of strongly connected components, edges, and simple cycles, and the average vertex degree. The properties of undirected graphs include
+ the first two Betti numbers (the number of connected components and the number of simple cycles), the matching number, and the chordality.
+
+ 2. **Features derived from barcodes** include descriptive characteristics of 0/1-dimensional barcodes and reflect the survival (birth and death) of
+ connected components and edges throughout the filtration.
+
+ 3. **Distance-to-pattern** features measure the distance between attention matrices and identity matrices of pre-defined attention patterns, such as attention to the first token [CLS] and to the last
+ token [SEP] of the sequence, attention to the previous and
+ next tokens, and attention to punctuation marks.
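
As an illustration of the topological features in item 1, here is a minimal sketch that thresholds a toy attention matrix into an undirected graph and computes the edge count, the first two Betti numbers, and the average vertex degree. The symmetrisation rule, the threshold value, and the function name are assumptions made for this example; the notebooks in the official repository remain the reference implementation.

```python
def undirected_graph_features(attn, threshold):
    """Features of the undirected graph obtained by thresholding `attn`.

    Tokens are vertices. An edge {i, j} is kept when
    max(attn[i][j], attn[j][i]) >= threshold (an assumed symmetrisation).
    """
    n = len(attn)
    edges = {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if max(attn[i][j], attn[j][i]) >= threshold
    }

    # Union-find to count connected components (Betti number b0).
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n
    for i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            components -= 1

    # For a graph, b1 = |E| - |V| + b0 (number of independent cycles).
    b1 = len(edges) - n + components
    avg_degree = 2 * len(edges) / n if n else 0.0
    return {"edges": len(edges), "b0": components, "b1": b1, "avg_degree": avg_degree}


# Toy 4-token attention matrix (each row sums to 1); the values are made up.
attn = [
    [0.70, 0.10, 0.10, 0.10],
    [0.40, 0.30, 0.25, 0.05],
    [0.30, 0.35, 0.30, 0.05],
    [0.05, 0.05, 0.05, 0.85],
]
features = undirected_graph_features(attn, threshold=0.25)
print(features)
```

In this toy case the first three tokens form a triangle and the fourth is isolated, so the graph has two components and one independent cycle.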
+
+ The **computed features and barcodes** can be found in the subdirectories of the repository. The *test_sub* features and barcodes were computed on the out-of-domain RuCoLA test set.
+ Refer to notebooks 4* and 5* from the [repository](https://github.com/upunaprosk/la-tda) to construct the classification pipeline with TDA features.
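
The distance-to-pattern features described above can be sketched as follows. This example assumes a plain Frobenius distance between an attention matrix and a binary pattern matrix; the normalisation used in the repository may differ, and the helper names and the handling of the first token in the "previous token" pattern are hypothetical choices made for illustration.

```python
import math


def pattern_first_token(n):
    """Binary n x n pattern: every token attends only to the first token."""
    return [[1.0 if j == 0 else 0.0 for j in range(n)] for _ in range(n)]


def pattern_previous_token(n):
    """Binary pattern: token i attends to token i - 1.

    The first token has no predecessor, so it is assumed to attend to itself.
    """
    return [[1.0 if j == max(i - 1, 0) else 0.0 for j in range(n)] for i in range(n)]


def frobenius_distance(a, b):
    """Frobenius norm of the element-wise difference of two matrices."""
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


# Toy attention matrix with uniform attention over two tokens (made-up values).
attn = [[0.5, 0.5], [0.5, 0.5]]
print(frobenius_distance(attn, pattern_first_token(2)))
print(frobenius_distance(attn, pattern_previous_token(2)))
```

A smaller distance means the attention head behaves more like the given pattern; one such value per head and pattern yields a feature vector for the classifier.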
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 1e-05
+ - train_batch_size: 32
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 5.0
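
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step under a linear decay from the listed `learning_rate` to zero. The warmup length (assumed to be 0) and the training-set size are not stated in this card, so `warmup_steps` and `num_train_examples` are hypothetical values used only for illustration.

```python
import math


def linear_lr(step, total_steps, base_lr=1e-05, warmup_steps=0):
    """Learning rate at `step` for a linear schedule with optional warmup."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hyperparameters listed above.
train_batch_size = 32
num_epochs = 5

# Hypothetical dataset size, NOT taken from the model card.
num_train_examples = 6400
steps_per_epoch = math.ceil(num_train_examples / train_batch_size)
total_steps = steps_per_epoch * num_epochs

for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps))
```

With no warmup, the rate starts at 1e-05, is halved at the midpoint of training, and reaches zero at the final step.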
+
+ ### Framework versions
+
+ - Transformers 4.27.0.dev0
+ - PyTorch 1.13.1+cu116
+ - Datasets 2.9.0
+ - Tokenizers 0.13.2