King-Harry / Ninja-Masker-2-PII-Redaction

King-Harry committed on Aug 22, 2024

Commit 1c9818f · verified · 1 Parent(s): 7ce6937

Update README.md

Files changed (1)

README.md +40 -7

@@ -60,18 +60,51 @@ The model is designed for responsible data management, ensuring that sensitive i
 To use this model, you can load it from the Hugging Face Hub and integrate it into your Python or API-based applications. Below is an example of how to load and use the model:
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 model_name = "King-Harry/Ninja-Masker-2-PII-Redaction"
-model = AutoModelForCausalLM.from_pretrained(model_name)
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-input_text = "Write an email to Kendra Harvey at [email protected] summarizing the key findings from a recent cognitive therapy conference."
-inputs = tokenizer(input_text, return_tensors="pt")
-outputs = model.generate(**inputs, max_new_tokens=64)
-redacted_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
-print(redacted_text)
 ```
 ### Citation

 To use this model, you can load it from the Hugging Face Hub and integrate it into your Python or API-based applications. Below is an example of how to load and use the model:
 ```python
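+# Install necessary packages (the `!` lines are notebook shell commands; run them in Colab or Jupyter, not in a plain Python script)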
+!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
+!pip install --no-deps "xformers<0.0.27" "trl<0.9.0" peft accelerate bitsandbytes
 from transformers import AutoModelForCausalLM, AutoTokenizer
+from unsloth import FastLanguageModel
+# Load the fine-tuned model from Hugging Face Hub
 model_name = "King-Harry/Ninja-Masker-2-PII-Redaction"
+model, tokenizer = FastLanguageModel.from_pretrained(model_name, load_in_4bit=True)
+# Ensure the model is ready for inference
+FastLanguageModel.for_inference(model)
+# Define the Alpaca-style prompt
+alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
+### Instruction:
+{}
+### Input:
+{}
+### Response:
+{}"""
+# Define the input text using the Alpaca prompt
+inputs = tokenizer(
+    [
+        alpaca_prompt.format(
+            "Replace all the PII from this text and use only the following tags: [FULLNAME], [NAME], [EMAIL], [CITY], [JOBAREA], [FIRSTNAME], [STATE], [STREETADDRESS], [URL], [USERNAME], [NUMBER], [JOBTITLE], [LASTNAME], [ACCOUNTNUMBER], [AMOUNT], [BUILDINGNUMBER], [ZIPCODE], [CURRENCY], [STREET], [PASSWORD], [IPV4], [CURRENCYNAME], [ACCOUNTNAME], [GENDER], [COUNTY], [CREDITCARDNUMBER], [DISPLAYNAME], [IPV6], [USERAGENT], [BITCOINADDRESS], [CURRENCYCODE], [JOBTYPE], [IBAN], [ETHEREUMADDRESS], [MAC], [IP], [CREDITCARDISSUER], [CREDITCARDCVV], [MASKEDNUMBER], [SEX], [JOBDESCRIPTOR]", # instruction
+            "Write an email to Kendra Harvey at [email protected] summarizing the key findings from a recent cognitive therapy conference they attended.", # input
+            ""  # output - leave this blank for generation!
+        )
+    ],
+    return_tensors="pt"
+).to("cuda")
+# Generate the redacted output
+outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)
+# Decode and print the output
+redacted_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)
+print(redacted_text[0])
 ```
 ### Citation
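
A note on the updated example: `generate` returns the prompt tokens followed by the new tokens, so the decoded string printed by `print(redacted_text[0])` will most likely contain the whole Alpaca prompt as well as the redacted text. The sketch below shows one way to keep only the part after the `### Response:` marker. The helper name `extract_response` and the sample string are hypothetical and are not part of this commit or of any library; the sketch also assumes the marker does not appear inside the input text itself.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(decoded: str, marker: str = RESPONSE_MARKER) -> str:
    """Return the text that follows the first occurrence of `marker`.

    Hypothetical helper: if the marker is missing, the whole string is
    returned (stripped) so the caller never loses the model output.
    """
    _, found, tail = decoded.partition(marker)
    return tail.strip() if found else decoded.strip()


if __name__ == "__main__":
    # Illustrative string only; this is not real model output.
    sample = (
        "### Instruction:\nReplace all the PII from this text.\n"
        "### Input:\nWrite an email to Kendra Harvey.\n"
        "### Response:\nWrite an email to [FULLNAME].\n"
    )
    print(extract_response(sample))
```

With the example in this commit, the helper would be applied to `redacted_text[0]` after `tokenizer.batch_decode(...)`.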