Lingo-IITGN committed
Commit • 8fc0b91 • 1 Parent(s): 6dd3352
Update README.md
README.md CHANGED
@@ -81,16 +81,38 @@ This model described is a research preview and is under ongoing iterative updati
 
 ### Results
 
-
-
+<details open>
+<summary>Tokenizers Results</summary>
+<br>
+
+| Model | Fertility |
+|:-----------:|:---------:|
+| ***ganga-1b*** | ***1.12*** |
+| pragna-1b | 1.58 |
+| bloom-1b1 | 1.27 |
+| bloom-1b7 | 1.27 |
+| gemma-2b | 1.89 |
+| bloom-3b | 1.27 |
+| airavata-7b | 1.69 |
+
+</details>
+
+
+<details open>
+<summary>Metrics</summary>
+<br>
+
+| Model | PPL_{Ours | PPL_{Airawat} |
 |:-----------:|:---------:|:------:|
-| ganga-1b |
-| pragna-1b |
-| bloom-1b1 |
-| bloom-1b7 |
-| gemma-2b |
-| bloom-3b |
-| airavata-7b |
+| ganga-1b | | 34.85 |
+| pragna-1b | | 12.74 |
+| bloom-1b1 | | 33.39 |
+| bloom-1b7 | | 26.63 |
+| gemma-2b | | 41.67 |
+| bloom-3b | | 23.77 |
+| airavata-7b | | 46.24 |
+
+</details>
 
 
 #### Summary
@@ -104,15 +126,15 @@ This model described is a research preview and is under ongoing iterative updati
 Ganga-1b is a decoder-only transformer model, featuring the following specifications:
 
 
-*
-*
-* Embedding dimension:
-* Vocabulary size:
+* Layers: 16
+* Attention heads: 32
+* Embedding dimension: 2,048
+* Vocabulary size: 30,000
 * Sliding window: 512
 * Intermediate dimension: 716
 
 
 ## Model Card Contact
 
-[Lingo Research Labs at IIT Gandhinagar, India](https://labs.iitgn.ac.in/lingo/)
+[Lingo Research Labs at IIT Gandhinagar, India](https://labs.iitgn.ac.in/lingo/) </br>
 Mail at: [[email protected]]([email protected])
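
The README change reports a "Fertility" score per tokenizer but does not define it. A minimal sketch, assuming the common definition (average number of tokens produced per whitespace-separated word, lower being better), might look like the following. The `fertility` and `toy_tokenize` names are hypothetical and introduced here only for illustration; this is not the authors' evaluation script, and the evaluation corpus is not stated in the commit.

```python
from typing import Callable, Iterable, List


def fertility(texts: Iterable[str], tokenize: Callable[[str], List[str]]) -> float:
    """Average number of tokens per whitespace-separated word.

    Assumed definition; the README itself does not spell it out.
    """
    total_words = 0
    total_tokens = 0
    for text in texts:
        total_words += len(text.split())
        total_tokens += len(tokenize(text))
    if total_words == 0:
        raise ValueError("cannot compute fertility on text with no words")
    return total_tokens / total_words


def toy_tokenize(text: str) -> List[str]:
    """Hypothetical stand-in tokenizer: cuts each word into chunks of at most 4 characters."""
    pieces: List[str] = []
    for word in text.split():
        pieces.extend(word[i:i + 4] for i in range(0, len(word), 4))
    return pieces


if __name__ == "__main__":
    # "abcd" stays one token, "efghij" splits into "efgh" + "ij": 3 tokens over 2 words.
    print(fertility(["abcd efghij"], toy_tokenize))
```

To score a real tokenizer under the same assumption, any callable that maps a string to a list of tokens can be passed in place of `toy_tokenize`.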