sarahyurick committed: Add title NemoCurator Instruction Data Guard

README.md CHANGED
```diff
@@ -5,12 +5,14 @@ tags:
 license: other
 ---
 
+# NemoCurator Instruction Data Guard
+
 # Model Overview
 
 ## Description:
-Instruction
+Instruction Data Guard is a deep-learning classification model that helps identify LLM poisoning attacks in datasets.
 It is trained on an instruction:response dataset and LLM poisoning attacks of such data.
-Note that optimal use for Instruction
+Note that optimal use for Instruction Data Guard is for instruction:response datasets.
 
 ### License/Terms of Use:
 [NVIDIA Open Model License Agreement](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf)
@@ -60,7 +62,7 @@ v1.0 <br>
 * Synthetic <br>
 
 ## Evaluation Benchmarks:
-Instruction
+Instruction Data Guard is evaluated based on two overarching criteria: <br>
 * Success on identifying LLM poisoning attacks, after the model was trained on examples of the attacks. <br>
 * Success on identifying LLM poisoning attacks, but without training on examples of those attacks, at all. <br>
 
@@ -127,7 +129,7 @@ class InstructionDataGuardNet(torch.nn.Module, PyTorchModelHubMixin):
         x = self.sigmoid(x)
         return x
 
-# Load Instruction
+# Load Instruction Data Guard classifier
 instruction_data_guard = InstructionDataGuardNet.from_pretrained("nvidia/instruction-data-guard")
 instruction_data_guard = instruction_data_guard.to(device)
 instruction_data_guard = instruction_data_guard.eval()
```