NiGuLa committed
Commit 2e1e225 · 1 Parent(s): f4c7a7c

Update README.md

Files changed (1)
  1. README.md +22 -4
README.md CHANGED
@@ -9,12 +9,17 @@ licenses:
  - cc-by-nc-sa
  ---
 
- Mutual implication score: a symmetric measure of text semantic similarity
+ ## Model overview
+
+ Mutual Implication Score: a symmetric measure of text semantic similarity
  based on a RoBERTa model pretrained for natural language inference
- and fine-tuned for paraphrase detection.
+ and fine-tuned for paraphrase detection. It is particularly useful for paraphrase detection.
 
+ ## How to use
  The following snippet illustrates code usage:
  ```python
+ !pip install mutual-implication-score
+
  from mutual_implication_score import MIS
  mis = MIS(device='cpu')
  source_texts = ['I want to leave this room',
@@ -26,8 +31,11 @@ print(scores)
  # expected output: [0.9748, 0.0545]
  ```
 
- The first two texts are semantically equivalent, their MIS is close to 1.
- The two other texts have different meanings, and their score is low.
+ ## Model details
+
+ We slightly modify the [RoBERTa-Large NLI](https://huggingface.co/ynie/roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli) model architecture (see the scheme below) and fine-tune it on the [QQP](https://www.kaggle.com/c/quora-question-pairs) paraphrase dataset.
+
+ ![MIS](https://huggingface.co/SkolkovoInstitute/Mutual_Implication_Score/blob/main/MIS.pdf)
 
  If you find this repository helpful, feel free to cite our publication:
 
@@ -48,3 +56,13 @@ If you find this repository helpful, feel free to cite our publication:
  abstract = "Text style transfer and paraphrasing of texts are actively growing areas of NLP, dozens of methods for solving these tasks have been recently introduced. In both tasks, the system is supposed to generate a text which should be semantically similar to the input text. Therefore, these tasks are dependent on methods of measuring textual semantic similarity. However, it is still unclear which measures are the best to automatically evaluate content preservation between original and generated text. According to our observations, many researchers still use BLEU-like measures, while there exist more advanced measures including neural-based that significantly outperform classic approaches. The current problem is the lack of a thorough evaluation of the available measures. We close this gap by conducting a large-scale computational study by comparing 57 measures based on different principles on 19 annotated datasets. We show that measures based on cross-encoder models outperform alternative approaches in almost all cases. We also introduce the Mutual Implication Score (MIS), a measure that uses the idea of paraphrasing as a bidirectional entailment and outperforms all other measures on the paraphrase detection task and performs on par with the best measures in the text style transfer task.",
  }
  ```
+
+
+ ## Licensing Information
+
+ [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License][cc-by-nc-sa].
+
+ [![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]
+
+ [cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/
+ [cc-by-nc-sa-image]: https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png
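
The diff above only shows the changed parts of the usage snippet, so the middle of the example (the comparison texts and the scoring call) is elided between the two hunks. The sketch below fills that gap for convenience; the `paraphrases` list and the `mis.compute(...)` call are assumptions inferred from the surrounding context (the second hunk's header mentions `print(scores)`, and the expected output contains one score per text pair), not lines shown in this commit.

```python
# Minimal usage sketch (assumes the package is installed: pip install mutual-implication-score).
from mutual_implication_score import MIS

mis = MIS(device='cpu')  # use device='cuda' if a GPU is available

source_texts = ['I want to leave this room',
                'Hello world, my name is Nick']
# Hypothetical second list of texts to compare against; the actual texts
# sitting between the two hunks of this diff are not shown in the commit.
paraphrases = ['I want to go out of this room',
               'Hello world, my surname is Petrov']

# Assumed scoring call: the README prints `scores` and expects one value per text pair.
scores = mis.compute(source_texts, paraphrases)
print(scores)  # the README's expected output is [0.9748, 0.0545]
```

The first pair is a close paraphrase and should score near 1; the second pair differs in meaning and should score near 0.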
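The cited abstract describes MIS as building on the idea of paraphrase as bidirectional entailment, on top of the linked RoBERTa-Large NLI cross-encoder. The following sketch is not the MIS model itself (MIS slightly modifies that architecture and fine-tunes it on QQP); it only illustrates the bidirectional-entailment idea by scoring entailment in both directions with the off-the-shelf NLI checkpoint and averaging the two probabilities. Treating label index 0 as "entailment" is an assumption based on that checkpoint's documentation.

```python
# Illustration only: bidirectional entailment with the referenced NLI cross-encoder.
# This is NOT the trained MIS model; MIS modifies this architecture and fine-tunes it on QQP.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "ynie/roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.eval()

def entailment_prob(premise: str, hypothesis: str) -> float:
    """Probability that `premise` entails `hypothesis` (label 0 assumed to be entailment)."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 0].item()

def bidirectional_entailment(a: str, b: str) -> float:
    # Average the two directions, mirroring the "mutual implication" idea.
    return 0.5 * (entailment_prob(a, b) + entailment_prob(b, a))

print(bidirectional_entailment("I want to leave this room",
                               "I want to go out of this room"))
```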