textdetox/Llama-pairwise-toxicity-evaluator

Generated from Trainer


NiGuLa committed 4 days ago

Commit e7dfb06 · verified · 1 Parent(s): 3167db2

Update README.md

Files changed (1)

README.md +67 -0


@@ -18,6 +18,8 @@ language:
 - uk
 - zh
 license: apache-2.0
+datasets:
+- textdetox/detoxification_pairwise_style_evaluation
 ---
@@ -26,7 +28,72 @@ It has been trained using [TRL](https://github.com/huggingface/trl) with [textde
 ## Quick start
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+from peft import PeftModel
+import torch
+# Load base model and tokenizer
+base_model = "meta-llama/Meta-Llama-3-8B-Instruct"
+model = AutoModelForCausalLM.from_pretrained(
+    base_model,
+    device_map="auto",
+    torch_dtype=torch.bfloat16,
+    token="<your-token>",  # replace with your Hugging Face access token (the base model is gated)
+)
+tokenizer = AutoTokenizer.from_pretrained(base_model, token="<your-token>")  # same access token as above
+# Load LoRA adapter
+lora_model = "textdetox/Llama-pairwise-toxicity-evaluator"
+model = PeftModel.from_pretrained(model, lora_model)
+# Set to eval mode
+model.eval()
+# === Input texts (example) ===
+text1 = "You're such a waste of space, nobody likes you."
+text2 = "I don't agree with you, but I respect your opinion."
+# === Prompt ===
+prompt = f"""You are given with two texts. Your task is to compare which text is more toxic.
+Options:
+- text1 (text1 is more toxic than text2)
+- text2 (text2 is more toxic than text1)
+- None (both texts have equal toxicity - both of them are either toxic or non-toxic at the same time)
+Examples:
+text1: "I hate you, you are the worst person ever."
+text2: "I love you, you are the best person ever."
+Answer: text1
+text1: "You are so smart, I can't believe you did that."
+text2: "You are so stupid, I can't believe you did that."
+Answer: text2
+text1: "I think you are a great person."
+text2: "I think you are a very good man"
+Answer: none
+You must return ONLY one of the options. Do not add any explanations or additional information.
+text1: {text1}
+text2: {text2}
+Answer:"""
+# Tokenize
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+# Generate output
+with torch.no_grad():
+    outputs = model.generate(**inputs, max_new_tokens=5)
+    answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
+# Print result
+print("Model prediction:", answer.strip())
+```
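The quick-start snippet prints the raw generation. Since it generates up to five new tokens, the decoded string may carry extra whitespace, punctuation, or different casing (the prompt lists `None` as an option but its example answers `none`). A minimal sketch of a post-processing step follows. The helper `parse_prediction` is hypothetical and not part of this repository; it only assumes the model answers with one of the three options named in the prompt.

```python
import re
from typing import Optional

# Hypothetical helper (not part of the model card): matches the first
# occurrence of one of the three options listed in the prompt.
_LABEL_PATTERN = re.compile(r"\b(text1|text2|none)\b", re.IGNORECASE)


def parse_prediction(answer: str) -> Optional[str]:
    """Map a raw generation to 'text1', 'text2', or 'none'.

    Returns None (the Python value) when no option is found, so callers
    can tell an unparseable generation from the model's 'none' answer.
    """
    match = _LABEL_PATTERN.search(answer)
    return match.group(1).lower() if match else None
```

With the snippet above, this could be called as `label = parse_prediction(answer)`. Treating an unparseable generation as a separate case, rather than folding it into `none`, keeps evaluation counts honest if the model drifts from the expected format.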
 ### Training framework versions