Tags: NeMo · PyTorch · English · Hindi · nemotron
Commit 45d23e0 (1 parent: 90acf82), committed by ravirajoshi

Update README.md

Files changed (1): README.md (+3 / -5)
````diff
@@ -18,15 +18,13 @@ Please refer to our [arXiv paper](https://arxiv.org/abs/2410.14815) for more det
 
 Try this model on [build.nvidia.com](https://build.nvidia.com/nvidia/nemotron-4-mini-hindi-4b-instruct).
 
-For more details about how this model is used for [NVIDIA ACE](https://developer.nvidia.com/ace), please refer to [this blog post](https://developer.nvidia.com/blog/deploy-the-first-on-device-small-language-model-for-improved-game-character-roleplay/) and [this demo video](https://www.youtube.com/watch?v=d5z7oIXhVqg), which showcases how the model can be integrated into a video game. You can download the model checkpoint for NVIDIA AI Inference Manager (AIM) SDK from [here](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ucs-ms/resources/nemotron-mini-4b-instruct).
-
 **Model Developer:** NVIDIA
 
 **Model Dates:** Nemotron-4-Mini-Hindi-4B-Instruct was trained between June 2024 and Oct 2024.
 
 ## License
 
-[NVIDIA Community Model License](https://huggingface.co/nvidia/Nemotron-4-Mini-Hindi-4B-Instruct/blob/main/nvidia-community-model-license-aug2024.pdf)
+Nemotron-4-Mini-Hindi-4B-Instruct is released under the [NVIDIA Open Model License Agreement](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf).
 
 ## Model Architecture
 
@@ -106,7 +104,7 @@ tokenizer = AutoTokenizer.from_pretrained("nvidia/Nemotron-4-Mini-Hindi-4B-Inst
 messages = [
     {"role": "user", "content": "भारत की संस्कृति के बारे में बताएं।"},
 ]
-pipe = pipeline("text-generation", model="nvidia/Nemotron-4-Mini-Hindi-4B-Instruct")
+pipe = pipeline("text-generation", model="nvidia/Nemotron-4-Mini-Hindi-4B-Instruct", max_new_tokens=128)
 pipe.tokenizer = tokenizer  # You need to assign tokenizer manually
 pipe(messages)
 ```
@@ -151,7 +149,7 @@ NVIDIA believes Trustworthy AI is a shared responsibility and we have establishe
 
 If you find our work helpful, please consider citing our paper:
 ```
-@article{hindiminitron2024,
+@article{hindinemotron2024,
 title={Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus},
 author={Raviraj Joshi and Kanishk Singla and Anusha Kamath and Raunak Kalani and Rakesh Paul and Utkarsh Vaidya and Sanjay Singh Chauhan and Niranjan Wartikar and Eileen Long},
 journal={arXiv preprint arXiv:2410.14815},
````
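For reference, the usage snippet as it reads after this commit can be sketched as follows. This is assembled from the diff context above; the `build_pipeline` helper, the lazy import, and the `__main__` guard are conveniences of this sketch, not part of the README itself.

```python
# Sketch of the README usage snippet after this commit (assumptions noted above).

MODEL_ID = "nvidia/Nemotron-4-Mini-Hindi-4B-Instruct"

# Chat-style input: one user turn asking, in Hindi, "Tell me about the culture of India."
messages = [
    {"role": "user", "content": "भारत की संस्कृति के बारे में बताएं।"},
]


def build_pipeline():
    """Construct the text-generation pipeline with the token cap added in this commit."""
    # Imported lazily here so the module can be inspected without transformers installed.
    from transformers import AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # max_new_tokens=128 bounds the response length (the change this commit makes).
    pipe = pipeline("text-generation", model=MODEL_ID, max_new_tokens=128)
    pipe.tokenizer = tokenizer  # You need to assign tokenizer manually
    return pipe


if __name__ == "__main__":
    pipe = build_pipeline()  # downloads the ~4B-parameter weights on first run
    print(pipe(messages))
```

Passing `max_new_tokens` at pipeline construction sets a default for every call, so individual `pipe(messages)` invocations need no extra generation arguments.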