junnyu committed on
Commit
c54ef05
2 Parent(s): 0068926 75bc328

Merge branch 'main' of https://huggingface.co/junnyu/roformer_small_discriminator

Files changed (1)
  1. README.md +52 -0
README.md ADDED
@@ -0,0 +1,52 @@
+ ---
+ language: "en"
+ thumbnail: "https://github.com/junnyu"
+ tags:
+ - pytorch
+ - electra
+ - roformer
+ - rotary position embedding
+ license: "MIT"
+ datasets:
+ - openwebtext
+
+ ---
+ # 1. ELECTRA-small model with rotary position embedding, trained by the author on the openwebtext dataset
+
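For readers unfamiliar with rotary position embedding, the idea can be sketched in a few lines of pure Python. This is an illustrative sketch of the interleaved-pair formulation, not the code used to train this model; the helper names `rotary_embed` and `dot` and the sample vectors are hypothetical, and the `roformer` package may lay out the rotated pairs differently.

```python
import math


def rotary_embed(vec, position, base=10000.0):
    """Rotate each pair (vec[2i], vec[2i+1]) by the angle position * base**(-2i/d)."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("rotary embedding needs an even dimension")
    out = []
    for i in range(d // 2):
        theta = position * base ** (-2.0 * i / d)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[2 * i], vec[2 * i + 1]
        # Standard 2D rotation of the pair (x, y) by theta
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a, b):
    """Plain dot product, standing in for an unscaled attention score."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical query/key vectors (illustrative values only)
q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]

# The same relative offset (2) at different absolute positions
score_a = dot(rotary_embed(q, 5), rotary_embed(k, 3))
score_b = dot(rotary_embed(q, 12), rotary_embed(k, 10))
print(abs(score_a - score_b) < 1e-9)
```

Because each pair is only rotated, the query-key score depends on the difference between the two positions rather than on their absolute values, which is the property the comparison at the end checks.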
+ # 2. Reproduction results (GLUE dev set)
+ |Model|CoLA|SST|MRPC|STS|QQP|MNLI|QNLI|RTE|Avg.|
+ |---|---|---|---|---|---|---|---|---|---|
+ |ELECTRA-Small-OWT (original)|56.8|88.3|87.4|86.8|88.3|78.9|87.9|68.5|80.36|
+ |**ELECTRA-Small-OWT (this)**|xx|xx|xx|xx|xx|xx|xx|xx|xx|
+
+ # 3. Training details
+ - Dataset: openwebtext
+ - Training batch size (batch_size): 256
+ - Learning rate (lr): 5e-4
+ - Maximum sequence length (max_seqlen): 128
+ - Total training steps: 500,000 (50W)
+ - GPU: RTX 3090
+ - Total training time: 55 h
+
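As a rough sanity check, the training details above imply the following scale. This is back-of-the-envelope arithmetic under stated assumptions: `batch_size` counts sequences per optimizer step, every sequence is filled to `max_seqlen` (so the token count is an upper bound), and the 55 hours are pure training wall time. The variable names are illustrative.

```python
batch_size = 256          # sequences per step
max_seqlen = 128          # maximum tokens per sequence
total_steps = 500_000     # "50W" read as 50 x 10,000
train_hours = 55          # reported total training time

sequences_seen = batch_size * total_steps
# Upper bound: assumes every sequence is padded or packed to max_seqlen
max_tokens_seen = sequences_seen * max_seqlen
steps_per_second = total_steps / (train_hours * 3600)

print(f"{sequences_seen:,} sequences")
print(f"at most {max_tokens_seen:,} tokens")
print(f"about {steps_per_second:.2f} steps/s")
```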
+ ### wandb logs
+ https://wandb.ai/junyu/electra_rotary_small_pretrain?workspace=user-junyu
+
+ # 4. Installation
+ ```bash
+ # Install the released package from PyPI
+ pip install roformer
+
+ # Alternatively, install the latest version from GitHub
+ pip install git+https://github.com/JunnYu/RoFormer_pytorch.git
+ ```
+
+ # 5. Usage
+ ```python
+ import torch
+ from roformer import RoFormerModel
+ from transformers import ElectraTokenizer
+
+ # Load the tokenizer and the pretrained discriminator weights
+ tokenizer = ElectraTokenizer.from_pretrained("junnyu/roformer_small_discriminator")
+ model = RoFormerModel.from_pretrained("junnyu/roformer_small_discriminator")
+
+ inputs = tokenizer("Beijing is the capital of China.", return_tensors="pt")
+ # Inference only, so gradient tracking is disabled
+ with torch.no_grad():
+     outputs = model(**inputs)
+ # outputs[0] holds the last hidden states
+ print(outputs[0].shape)
+ ```