David committed
Commit 296e183
1 Parent(s): b6d9472
Update README.md

README.md CHANGED

@@ -14,8 +14,8 @@ We release a `small` and `medium` version with the following configuration:
 
 | Model | Layers | Embedding/Hidden Size | Params | Vocab Size | Max Sequence Length | Cased |
 | --- | --- | --- | --- | --- | --- | --- |
-| SELECTRA small | 12 | 256 | 22M | 50k | 512 | True |
-| **SELECTRA medium** | **12** | **384** | **41M** | **50k** | **512** | **True** |
+| [SELECTRA small](https://huggingface.co/Recognai/selectra_small) | 12 | 256 | 22M | 50k | 512 | True |
+| [**SELECTRA medium**](https://huggingface.co/Recognai/selectra_medium) | **12** | **384** | **41M** | **50k** | **512** | **True** |
 
 Selectra small (medium) is about 5 (3) times smaller than BETO but achieves comparable results (see Metrics section below).
 
@@ -27,8 +27,8 @@ The discriminator should therefore activate the logit corresponding to the fake
 ```python
 from transformers import ElectraForPreTraining, ElectraTokenizerFast
 
-discriminator = ElectraForPreTraining.from_pretrained("Recognai/
-tokenizer = ElectraTokenizerFast.from_pretrained("Recognai/
+discriminator = ElectraForPreTraining.from_pretrained("Recognai/selectra_small")
+tokenizer = ElectraTokenizerFast.from_pretrained("Recognai/selectra_small")
 
 sentence_with_fake_token = "Estamos desayunando pan rosa con tomate y aceite de oliva."
 
@@ -39,13 +39,15 @@ print("\t".join(tokenizer.tokenize(sentence_with_fake_token)))
 print("\t".join(map(lambda x: str(x)[:4], logits[1:-1])))
 """Output:
 Estamos desayun ##ando pan rosa con tomate y aceite de oliva .
--
+-3.1 -3.6 -6.9 -3.0 0.19 -4.5 -3.3 -5.1 -5.7 -7.7 -4.4 -4.2
 """
 ```
 
-However, you probably want to use this model to fine-tune it on a
+However, you probably want to use this model to fine-tune it on a downstream task.
+We provide models fine-tuned on the [XNLI dataset](https://huggingface.co/datasets/xnli), which can be used together with the zero-shot classification pipeline:
 
--
+- [Zero-shot SELECTRA small](https://huggingface.co/Recognai/zeroshot_selectra_small)
+- [Zero-shot SELECTRA medium](https://huggingface.co/Recognai/zeroshot_selectra_medium)
 
 ## Metrics
 
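The hunk above fills in the expected output of the README example, but the lines that actually build `logits` (roughly lines 35-38 of the README) fall outside the diff context. For reference, a minimal end-to-end sketch of the same fake-token detection flow, assuming the standard `transformers` ELECTRA API; the tokenizer call, the `torch.no_grad()` block, and the printing loop are reconstructions, not the README's exact omitted lines:

```python
import torch
from transformers import ElectraForPreTraining, ElectraTokenizerFast

# Checkpoints taken from the added lines in the second hunk above.
discriminator = ElectraForPreTraining.from_pretrained("Recognai/selectra_small")
tokenizer = ElectraTokenizerFast.from_pretrained("Recognai/selectra_small")

# "rosa" is the planted fake token, so its logit should stand out.
sentence_with_fake_token = "Estamos desayunando pan rosa con tomate y aceite de oliva."

# Encode the sentence and run the discriminator: one logit per token,
# where a larger value means "this token looks replaced".
inputs = tokenizer(sentence_with_fake_token, return_tensors="pt")
with torch.no_grad():
    logits = discriminator(**inputs).logits.squeeze(0)

# Print each token next to its logit, skipping the [CLS]/[SEP] positions.
for token, logit in zip(tokenizer.tokenize(sentence_with_fake_token), logits[1:-1]):
    print(f"{token}\t{logit.item():.2f}")
```

Run on the example sentence, the planted token "rosa" should receive the largest logit, matching the 0.19 value shown in the output above.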
@@ -59,7 +61,7 @@ We fine-tune our models on 4 different down-stream tasks:
 For each task, we conduct 5 trials and state the mean and standard deviation of the metrics in the table below.
 To compare our results to other Spanish language models, we provide the same metrics taken from [Table 4](https://huggingface.co/bertin-project/bertin-roberta-base-spanish#results) of the Bertin-project model card.
 
-| Model | CoNLL2002 - POS (acc) | CoNLL2002 - NER (f1) | PAWS-X (acc) | XNLI (acc) | Params |
+| Model | [CoNLL2002](https://huggingface.co/datasets/conll2002) - POS (acc) | [CoNLL2002](https://huggingface.co/datasets/conll2002) - NER (f1) | [PAWS-X](https://huggingface.co/datasets/paws-x) (acc) | [XNLI](https://huggingface.co/datasets/xnli) (acc) | Params |
 | --- | --- | --- | --- | --- | --- |
 | SELECTRA small | 0.9653 +- 0.0007 | 0.863 +- 0.004 | 0.896 +- 0.002 | 0.784 +- 0.002 | **22M** |
 | SELECTRA medium | 0.9677 +- 0.0004 | 0.870 +- 0.003 | 0.896 +- 0.002 | **0.804 +- 0.002** | 41M |
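The third hunk above links two checkpoints fine-tuned on XNLI for zero-shot classification. A minimal sketch of how such a checkpoint is typically plugged into the `transformers` zero-shot classification pipeline; the example sentence, the candidate labels, and the Spanish `hypothesis_template` are illustrative assumptions rather than content from this commit:

```python
from transformers import pipeline

# One of the XNLI fine-tuned checkpoints linked above; the small variant
# (Recognai/zeroshot_selectra_small) is used the same way.
classifier = pipeline("zero-shot-classification",
                      model="Recognai/zeroshot_selectra_medium")

result = classifier(
    "Los precios de la vivienda subieron un 5% el último trimestre.",
    candidate_labels=["economía", "deportes", "cultura", "salud"],
    # Assumed Spanish template; the pipeline's default template is in English.
    hypothesis_template="Este ejemplo es {}.",
)
print(result["labels"][0], round(result["scores"][0], 3))
```

The pipeline scores each candidate label as an NLI hypothesis against the input sentence and returns the labels sorted by score.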