Update README.md #3
opened by Chris-Alexiuk

README.md CHANGED
@@ -23,9 +23,9 @@ Throughout the alignment process, we relied on only approximately 20K human-anno
 This results in a model that is aligned for human chat preferences, improvements in mathematical reasoning, coding and instruction-following, and is capable of generating high quality synthetic data for a variety of use cases.
 
 Under the NVIDIA Open Model License, NVIDIA confirms:
-Models are commercially usable.
-You are free to create and distribute Derivative Models.
-NVIDIA does not claim ownership to any outputs generated using the Models or Derivative Models.
+- Models are commercially usable.
+- You are free to create and distribute Derivative Models.
+- NVIDIA does not claim ownership to any outputs generated using the Models or Derivative Models.
 
 ### License:
 
@@ -309,9 +309,9 @@ Evaluated using the CantTalkAboutThis Dataset as introduced in the [CantTalkAbou
 ### Adversarial Testing and Red Teaming Efforts
 
 The Nemotron-4 340B-Instruct model underwent extensive safety evaluation including adversarial testing via three distinct methods:
-[Garak](https://docs.garak.ai/garak), is an automated LLM vulnerability scanner that probes for common weaknesses, including prompt injection and data leakage.
-[AEGIS](https://arxiv.org/pdf/2404.05993), is a content safety evaluation dataset and LLM based content safety classifier model, that adheres to a broad taxonomy of 13 categories of critical risks in human-LLM interactions.
-Human Content Red Teaming leveraging human interaction and evaluation of the models' responses.
+- [Garak](https://docs.garak.ai/garak) is an automated LLM vulnerability scanner that probes for common weaknesses, including prompt injection and data leakage.
+- [AEGIS](https://arxiv.org/pdf/2404.05993) is a content safety evaluation dataset and LLM-based content safety classifier model that adheres to a broad taxonomy of 13 categories of critical risks in human-LLM interactions.
+- Human Content Red Teaming, leveraging human interaction and evaluation of the models' responses.
 
 ### Limitations
 