Create README.md
README.md
ADDED
@@ -0,0 +1,66 @@
---
language: en
tags:
- Explain code
- Code Summarization
- Summarization
license: mit
---

# Gemini

For an in-depth understanding of our model and methods, please see our [blog post](https://www.describe-ai.com/gemini).

## Model description

Gemini is a transformer based on Google's T5 model. It was pre-trained on approximately 800k code/description pairs and then fine-tuned on 10k synthetically generated, higher-level explanations. Gemini can summarize and explain short to medium code snippets in:

- Python
- JavaScript (mostly vanilla JS, though it can also handle frameworks like React)
- Java
- Ruby
- Go

It outputs a description in English.

## Intended uses & limitations

Without any additional fine-tuning, Gemini can explain code in a sentence or two, and it typically performs best on Python and JavaScript. We recommend using Gemini for simple code explanation, documentation, or producing more synthetic data to improve its explanations.

### How to use

You can use this model directly with a pipeline for Text2Text generation, as shown below:

```python
from transformers import pipeline

# Load the model into a text2text-generation pipeline
summarizer = pipeline('text2text-generation', model='describeai/gemini')
code = "print('hello world!')"

# The pipeline returns a list of dicts; the text is under 'generated_text'
result = summarizer(code, max_length=100, num_beams=3)
response = "Summarized code: " + result[0]['generated_text']
print(response)
```

This should yield something along the lines of:

```
Summarized code: The following code is greeting the world.
```
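
A text2text-generation pipeline returns a list of dictionaries rather than a plain string, so it can help to wrap the call in a small helper when describing several snippets. The following is a minimal sketch, not part of the Gemini release: `describe_snippets` is a hypothetical name, and `summarizer` is assumed to be any callable that follows the pipeline's return shape (a list whose first item has a `generated_text` key).

```python
def describe_snippets(summarizer, snippets, max_length=100, num_beams=3):
    """Return a list of (snippet, description) pairs.

    `summarizer` is assumed to behave like a transformers
    text2text-generation pipeline: calling it on one string returns
    a list of dicts, each with a 'generated_text' key.
    """
    results = []
    for snippet in snippets:
        outputs = summarizer(snippet, max_length=max_length, num_beams=num_beams)
        # Keep only the first generated sequence for this snippet
        results.append((snippet, outputs[0]["generated_text"].strip()))
    return results
```

With the real model, the `summarizer` pipeline from the example above can be passed in directly, for instance `describe_snippets(summarizer, ["print('hello world!')"])`.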

### Model sizes

- Gemini: 770 million parameters
- Gemini-Small (this repo): 220 million parameters
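
As a rough guide to what these sizes mean in practice, the weights-only memory footprint can be estimated from the parameter count. The sketch below assumes 4 bytes per parameter (float32) or 2 bytes (float16) and ignores activations and framework overhead; these byte widths are assumptions for illustration, not figures published for Gemini.

```python
# Parameter counts as stated in the model sizes above
MODEL_PARAMS = {
    "Gemini": 770_000_000,
    "Gemini-Small": 220_000_000,
}

def weight_memory_gb(num_params, bytes_per_param=4):
    """Estimate weights-only memory in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9

for name, params in MODEL_PARAMS.items():
    fp32 = weight_memory_gb(params)      # assumed float32 storage
    fp16 = weight_memory_gb(params, 2)   # assumed float16 storage
    print(f"{name}: ~{fp32:.2f} GB (float32), ~{fp16:.2f} GB (float16)")
```

Under these assumptions, the 770M model needs roughly 3.08 GB for its weights in float32 and the 220M model roughly 0.88 GB.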

### Limitations

Gemini may produce overly simplistic descriptions that don't cover the entire code snippet. We suspect that more training data would mitigate this and produce better results.
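
One possible workaround, offered here as an assumption rather than a documented Gemini feature, is to split a longer Python file into its top-level functions and classes and describe each one separately, so that every request stays in the short-to-medium range. The sketch uses Python's standard `ast` module; `split_top_level` and `describe_by_definition` are hypothetical helper names, and `summarizer` is assumed to follow the text2text-generation pipeline's return shape. Whether this improves the descriptions has not been measured here.

```python
import ast

def split_top_level(source):
    """Return (name, code) pairs for each top-level function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment recovers the node's source text (Python 3.8+)
            chunks.append((node.name, ast.get_source_segment(source, node)))
    return chunks

def describe_by_definition(summarizer, source, max_length=100, num_beams=3):
    """Describe each top-level definition separately; map name -> description."""
    descriptions = {}
    for name, code in split_top_level(source):
        outputs = summarizer(code, max_length=max_length, num_beams=num_beams)
        descriptions[name] = outputs[0]["generated_text"]
    return descriptions
```

Module-level statements outside any function or class are skipped by this sketch, so they would need to be described separately.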

### About Us

At Describe.ai, we are focused on building Artificial Intelligence systems that can understand language as well as humans. While it is a long path, we plan to contribute our findings to our API to the Open Source community.