iproskurina/tda-ruroberta-large-ru-cola

Text Classification

iproskurina committed on May 29, 2023

Commit aa9aaa8 · 1 Parent(s): c185800

Create README.md

Files changed (1)

README.md +61 -0

README.md ADDED

	@@ -0,0 +1,61 @@

+---
+license: apache-2.0
+tags:
+- TDA
+metrics:
+- accuracy
+- matthews_correlation
+model-index:
+- name: ruRoberta-large-ru-cola_32_1e-05_lr_0.0001_decay_balanced_freeze
+  results: []
+datasets:
+- RussianNLP/rucola
+language:
+- ru
+---
+[**Official repository**](https://github.com/upunaprosk/la-tda)
+# RuRoBERTa-large-TDA
+This model is a version of [sberbank-ai/ruRoberta-large](https://huggingface.co/sberbank-ai/ruRoberta-large) fine-tuned on [RuCoLA](https://huggingface.co/datasets/RussianNLP/rucola).
+It achieves the following results on the evaluation set:
+- Accuracy: 0.835
+- Matthews correlation coefficient (MCC): 0.530
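Reviewer note: the two metrics reported above can be reproduced from a list of gold labels and predictions. The sketch below is a minimal pure-Python illustration, not the evaluation code used for this model; it assumes binary labels with `1` treated as the positive class, and the toy label lists are made up for the example.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, treating 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_corrcoef(y_true, y_pred):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the score is 0 when any marginal count is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Toy data (hypothetical, not taken from RuCoLA).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
print(accuracy(y_true, y_pred), matthews_corrcoef(y_true, y_pred))
```

Unlike accuracy, MCC accounts for all four cells of the confusion matrix, which is why it is commonly reported alongside accuracy on class-imbalanced acceptability data.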
+## Features extracted from the Transformer
+The features extracted from attention maps include the following:
+1. **Topological features** are properties of attention graphs. Features of directed attention graphs include the number of strongly connected components, edges, and simple cycles, and the average vertex degree. Features of undirected graphs include the first two Betti numbers (the number of connected components and the number of simple cycles), the matching number, and the chordality.
+2. **Features derived from barcodes** include descriptive characteristics of 0- and 1-dimensional barcodes and reflect the survival (birth and death) of connected components and edges throughout the filtration.
+3. **Distance-to-pattern features** measure the distance between attention matrices and identity matrices of pre-defined attention patterns, such as attention to the first token [CLS] and to the last token [SEP] of the sequence, attention to the previous and next tokens, and attention to punctuation marks.
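Reviewer note: to make the undirected-graph features in item 1 concrete, here is a minimal pure-Python sketch. It is not the repository's implementation. It assumes the attention graph is built by keeping an undirected edge between tokens `i` and `j` whenever the attention weight in either direction reaches a threshold; the function name, the symmetrisation rule, and the toy matrix are all illustrative assumptions.

```python
def attention_graph_features(attn, threshold):
    """Edge count, Betti numbers b0/b1 and average degree of a thresholded attention graph.

    attn is a square list of lists of attention weights for one head.
    Assumption: edge {i, j} exists if max(attn[i][j], attn[j][i]) >= threshold;
    self-loops are ignored.
    """
    n = len(attn)
    edges = {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if max(attn[i][j], attn[j][i]) >= threshold
    }

    # Union-find over the vertices to count connected components (b0).
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)

    b0 = len({find(v) for v in range(n)})
    # For a graph, the first Betti number is the cycle rank: |E| - |V| + b0.
    b1 = len(edges) - n + b0
    return {"edges": len(edges), "b0": b0, "b1": b1, "avg_degree": 2 * len(edges) / n}

# Toy 4-token attention matrix (hypothetical values, rows sum to 1).
attn = [
    [0.70, 0.10, 0.10, 0.10],
    [0.50, 0.30, 0.10, 0.10],
    [0.40, 0.40, 0.10, 0.10],
    [0.05, 0.05, 0.10, 0.80],
]
features = attention_graph_features(attn, threshold=0.3)
print(features)
```

With this threshold, tokens 0, 1 and 2 form a triangle and token 3 is isolated, so the graph has two connected components and one independent cycle. Repeating the computation over a range of thresholds yields the filtration from which the barcodes in item 2 are derived.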
+The **computed features and barcodes** can be found in the subdirectories of the repository. The *test_sub* features and barcodes were computed on the out-of-domain RuCoLA test set.
+Refer to notebooks 4* and 5* from the [repository](https://github.com/upunaprosk/la-tda) to construct the classification pipeline with TDA features.
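Reviewer note: the distance-to-pattern features described in this section can be illustrated with a small pure-Python sketch. It assumes the distance is the Frobenius norm of the difference between the attention matrix and a binary pattern matrix after each is scaled to unit Frobenius norm; this normalisation and the helper names are assumptions made for illustration, so consult the notebooks in the repository for the exact definition.

```python
import math

def frobenius_norm(matrix):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in matrix for x in row))

def pattern_distance(attn, pattern):
    """Frobenius distance between two matrices after scaling each to unit norm.

    Both matrices must be non-zero and of the same shape.
    """
    norm_a, norm_p = frobenius_norm(attn), frobenius_norm(pattern)
    return math.sqrt(
        sum(
            (a / norm_a - p / norm_p) ** 2
            for row_a, row_p in zip(attn, pattern)
            for a, p in zip(row_a, row_p)
        )
    )

def first_token_pattern(n):
    """Every token attends only to position 0 (e.g. [CLS])."""
    return [[1.0 if j == 0 else 0.0 for j in range(n)] for _ in range(n)]

def previous_token_pattern(n):
    """Token i attends only to token i - 1 (the first row is empty)."""
    return [[1.0 if j == i - 1 else 0.0 for j in range(n)] for i in range(n)]

# A head that attends only to the first token matches that pattern exactly.
attn = first_token_pattern(3)
d_first = pattern_distance(attn, first_token_pattern(3))
d_prev = pattern_distance(attn, previous_token_pattern(3))
print(d_first, d_prev)
```

Under this normalisation the distance lies between 0 (identical up to scale) and the square root of 2 (no overlapping non-zero entries), so smaller values indicate that a head follows the pattern more closely.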
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 1e-05
+- train_batch_size: 32
+- eval_batch_size: 8
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 5.0
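Reviewer note: the `linear` scheduler entry means the learning rate decays linearly from its initial value to zero over the course of training. The sketch below illustrates that schedule in plain Python with the listed learning rate, batch size, and epoch count. The card does not state a warmup length, gradient accumulation, or dataset size, so `warmup_steps=0`, one optimizer step per batch, and the `num_examples` value are assumptions made for the example.

```python
import math

def total_training_steps(num_examples, batch_size=32, num_epochs=5):
    """Optimizer steps, assuming one step per batch and no gradient accumulation."""
    return math.ceil(num_examples / batch_size) * num_epochs

def linear_lr(step, total_steps, base_lr=1e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))

# Hypothetical dataset size, used only to show the shape of the schedule.
steps = total_training_steps(num_examples=1000)
for step in (0, steps // 2, steps):
    print(step, linear_lr(step, steps))
```

With these assumptions the learning rate starts at 1e-05, reaches half that value at the midpoint of training, and is zero at the final step.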
+### Framework versions
+- Transformers 4.27.0.dev0
+- PyTorch 1.13.1+cu116
+- Datasets 2.9.0
+- Tokenizers 0.13.2