Upload README.md with huggingface_hub

#1
by lbourdois - opened
Files changed (1)
  1. README.md +1 -76
README.md CHANGED
@@ -1,78 +1,3 @@
  ---
- tags:
- - summarization
- widget:
- - text: "parse the uses licence node of this package , if any , and returns the license definition if theres"
-
+ language: code
  ---
-
-
- # CodeTrans model for api recommendation generation
- Pretrained model for api recommendation generation using the t5 small model architecture. It was first released in
- [this repository](https://github.com/agemagician/CodeTrans).
-
-
- ## Model description
-
- This CodeTrans model is based on the `t5-small` model. It has its own SentencePiece vocabulary model. It used multi-task training on 13 supervised tasks in the software development domain and 7 unsupervised datasets.
-
- ## Intended uses & limitations
-
- The model could be used to generate api usage for the java programming tasks.
-
- ### How to use
-
- Here is how to use this model to generate java function documentation using Transformers SummarizationPipeline:
-
- ```python
- from transformers import AutoTokenizer, AutoModelWithLMHead, SummarizationPipeline
-
- pipeline = SummarizationPipeline(
- model=AutoModelWithLMHead.from_pretrained("SEBIS/code_trans_t5_small_api_generation_multitask"),
- tokenizer=AutoTokenizer.from_pretrained("SEBIS/code_trans_t5_small_api_generation_multitask", skip_special_tokens=True),
- device=0
- )
-
- tokenized_code = "parse the uses licence node of this package , if any , and returns the license definition if theres"
- pipeline([tokenized_code])
- ```
- Run this example in [colab notebook](https://github.com/agemagician/CodeTrans/blob/main/prediction/multitask/pre-training/api%20generation/small_model.ipynb).
- ## Training data
-
- The supervised training tasks datasets can be downloaded on [Link](https://www.dropbox.com/sh/488bq2of10r4wvw/AACs5CGIQuwtsD7j_Ls_JAORa/finetuning_dataset?dl=0&subfolder_nav_tracking=1)
-
-
- ## Training procedure
-
- ### Multi-task Pretraining
-
- The model was trained on a single TPU Pod V3-8 for 500,000 steps in total, using sequence length 512 (batch size 4096).
- It has a total of approximately 220M parameters and was trained using the encoder-decoder architecture.
- The optimizer used is AdaFactor with inverse square root learning rate schedule for pre-training.
-
-
- ## Evaluation results
-
- For the code documentation tasks, different models achieves the following results on different programming languages (in BLEU score):
-
- Test results :
-
- | Language / Model | Java |
- | -------------------- | :------------: |
- | CodeTrans-ST-Small | 68.71 |
- | CodeTrans-ST-Base | 70.45 |
- | CodeTrans-TF-Small | 68.90 |
- | CodeTrans-TF-Base | 72.11 |
- | CodeTrans-TF-Large | 73.26 |
- | CodeTrans-MT-Small | 58.43 |
- | CodeTrans-MT-Base | 67.97 |
- | CodeTrans-MT-Large | 72.29 |
- | CodeTrans-MT-TF-Small | 69.29 |
- | CodeTrans-MT-TF-Base | 72.89 |
- | CodeTrans-MT-TF-Large | **73.39** |
- | State of the art | 54.42 |
-
-
-
- > Created by [Ahmed Elnaggar](https://twitter.com/Elnaggar_AI) | [LinkedIn](https://www.linkedin.com/in/prof-ahmed-elnaggar/) and Wei Ding | [LinkedIn](https://www.linkedin.com/in/wei-ding-92561270/)
-