Collections of publicly available datasets
Indic Verse
IndicVerse
IndicVerse is dedicated to advancing natural language processing (NLP) capabilities for Indic languages. Our mission is to bridge the gap in NLP research for low-resource Indic languages by providing high-quality datasets, pre-trained models, and tools tailored for diverse linguistic needs.
What We Do
- Datasets: Creation and publication of datasets for various NLP tasks, including translation, classification, and generation, with a focus on Indic languages.
- Models: Development of state-of-the-art NLP models fine-tuned for Indic languages, leveraging techniques like PEFT and LoRA.
- Research: Conducting and sharing research to solve key challenges in Indic NLP, including transliteration, low-resource learning, and domain-specific applications.
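The LoRA technique mentioned above can be illustrated with a small, dependency-free sketch. This is not IndicVerse's training code; it only shows the general idea of a low-rank adapter: a frozen weight matrix `W` is combined with a trainable update `(alpha / r) * B @ A`, where `A` and `B` are much smaller than `W`. The helper names (`matmul`, `lora_effective_weight`, `lora_param_count`) and the sizes used are hypothetical, chosen for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank."""
    r = len(a)  # A has shape (r, d_in); B has shape (d_out, r)
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update with the same shape as W
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_param_count(d_out, d_in, r):
    """Number of trainable values in the adapter matrices A and B."""
    return r * (d_in + d_out)


# Toy sizes, chosen only for illustration.
d_out, d_in, r = 4, 6, 2
rng = random.Random(0)
w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen weights
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable, small random init
b = [[0.0] * r for _ in range(d_out)]                               # trainable, zero init

# With B initialised to zero, the adapted layer starts out identical to W.
w_eff = lora_effective_weight(w, a, b, alpha=16)

# Hypothetical 4096 x 4096 layer with rank 8: adapter size vs. full matrix size.
full_params = 4096 * 4096
adapter_params = lora_param_count(4096, 4096, 8)
```

Only `A` and `B` would be updated during fine-tuning, which is why the adapter for the hypothetical 4096 x 4096 layer has 256 times fewer trainable values than the full matrix. In practice, libraries such as PEFT handle this wiring for a real model.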
Featured Projects
- Hellaswag-Telugu: A Telugu version of the HellaSwag dataset for evaluating commonsense reasoning.
- Indic Language Translation and Transliteration: Custom tools and APIs for translation and mixed transliteration (Telugu-English).
How to Contribute
We welcome contributions! Whether you're interested in annotating data, building models, or sharing insights, feel free to get in touch.
Citation
If you use our datasets or models in your research, please cite us as follows:
@misc{IndicVerse2024,
author = {Nikhil Chowdary Paleti and Divi Eswar Chowdary},
title = {Indic Verse: Datasets and Models for Advancing Indic Languages in NLP},
year = {2024},
publisher = {Hugging Face},
url = {https://huggingface.co/IndicVerse}
}
Collections: 3
Models: 5
Datasets: 19
- indiehackers/hellaswag-telugu-custom-2k: 2.01k rows, 86 downloads, 2 likes
- indiehackers/hellaswag-telugu-custom: 10k rows, 34 downloads
- indiehackers/winogrande_debiased-telugu_filtered: 11.7k rows, 39 downloads
- indiehackers/databricks-dolly-15k-Telugu-romanized: 15k rows, 43 downloads
- indiehackers/winogrande_debiased-telugu-romanized-nodict: 12.3k rows, 41 downloads
- indiehackers/winogrande_debiased-telugu-romanized: 12.3k rows, 47 downloads
- indiehackers/winogrande_debiased-telugu: 12.3k rows, 33 downloads
- indiehackers/telugu_romanized_2000_mistral: 127k rows, 36 downloads
- indiehackers/telugu_romanized_2048_mistral: 125k rows, 36 downloads
- indiehackers/telugu_romanized: 87.9k rows, 36 downloads
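As a rough sketch, a dataset from this listing could be loaded with the Hugging Face `datasets` library. The repository ID below is copied from the listing; the wrapper function `load_telugu_hellaswag` is a hypothetical helper, and the split and column names are not stated on this page, so the sketch does not assume any.

```python
# Repository ID copied from the dataset listing on this page.
DATASET_ID = "indiehackers/hellaswag-telugu-custom-2k"


def load_telugu_hellaswag(repo_id=DATASET_ID):
    """Download the dataset from the Hugging Face Hub and return it.

    Requires the third-party `datasets` package and network access.
    The import is done lazily so this module can be imported without it.
    """
    from datasets import load_dataset

    # With no split argument, load_dataset returns every available split.
    return load_dataset(repo_id)
```

Calling `load_telugu_hellaswag()` and printing the result should show which splits and columns the dataset actually provides, which is worth checking before writing any evaluation code against it.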