Lingo-IITGN committed on
Commit
edb46e9
1 Parent(s): ff6a0a0

Update README.md

Files changed (1)
  1. README.md +4 -5
README.md CHANGED
@@ -31,19 +31,18 @@ widget:
 
  # Model Card for Ganga-1b! 🌊
 
- The base model **``Ganga-1b``** trained on a monolingual **Hindi** language dataset as part of ***Project Unity***. <br> *(The first pre-trained Hindi model by any academic research lab in India 🇮🇳!)**
+ The base model **``Ganga-1b``** trained on a monolingual **Hindi** language dataset as part of ***Project Unity***. We propose the name *Ganga* 🌊 to honor the longest river flowing through the Hindi-speaking region of India 🇮🇳.
 
-
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/667b8f8ba271fc5a8e6929de/jG3tZnGPvH6vcGrvxO-YC.png)
+ <br> *(The first pre-trained Hindi model by any academic research lab in India 🇮🇳!)**
 
 
- ## Model Details
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/667b8f8ba271fc5a8e6929de/jG3tZnGPvH6vcGrvxO-YC.png)
 
 
 
  ### Model Description 📚
 
- Project Unity is an initiative aimed at addressing **India's linguistic diversity** and richness by creating a comprehensive resource that covers the country's major languages. Our goal is to achieve state-of-the-art performance in understanding and generating text in **Indian languages**.
+ **Project Unity** is an initiative aimed at addressing **India's linguistic diversity** and richness by creating a comprehensive resource that covers the country's major languages. Our goal is to achieve state-of-the-art performance in understanding and generating text in **Indian languages**.
  To achieve this, we train models on the monolingual regional languages of India. Our first release is the *Ganga-1B* model, *which has been trained on a large dataset of public domain web-crawled hindi language data, including news articles, web documents, books, government publications, educational materials, and social media conversations (filtered for quality)*. Additionally, the dataset has been further curated by native Indian speakers to ensure high-quality.
  Importantly, the **Ganga-1B** model outperforms existing open-source models that support **Indian languages**, even at sizes of up to **7 billion parameters**.
 