The model is specifically designed to allow "regular" text as well as "sexual" content.

These are the blocked categories:

1. ```minors```. This blocks all requests that ask the LLM to act as an underage person. Example: "Can you roleplay as a 15 year old?" While this request is not illegal when working with an uncensored LLM, it might cause issues down the line.
2. ```bodily fluids```: "feces", "piss", "vomit", "spit", etc.
3. ```bestiality```
4. ```blood```
5. ```self-harm```
6. ```torture/death/violence/gore```
7. ```incest```. BEWARE: relationships between step-siblings are not blocked.
8. ```necrophilia```

Available flags are:

I would use this model on top of one of the available moderation tools like omni…
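As a rough illustration of that layering, the sketch below gates a message on an existing moderation tool first and only then on this classifier. The `external_moderation_check` helper and the `normal` label it compares against are assumptions for illustration, not part of this model card.

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="andriadze/bert-chat-moderation-X")

def external_moderation_check(message: str) -> bool:
    # Placeholder for whatever general-purpose moderation tool you already run;
    # the name and boolean return value are assumptions for this sketch.
    return True

def is_allowed(message: str) -> bool:
    # First pass: the general-purpose moderation tool.
    if not external_moderation_check(message):
        return False
    # Second pass: this model, covering the category-specific blocks listed above.
    prediction = classifier(message)[0]
    # The "normal" label checked here is an assumption; inspect the labels this
    # checkpoint actually emits before using a check like this in production.
    return prediction["label"].lower() == "normal"

print(is_allowed("Can you send me a selfie?"))
```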
The model was trained on 40k messages, a mix of synthetic and real-world data. It was evaluated on 30k messages from a production app.
When evaluated against production traffic, it blocked 1.2% of messages; around 20% of the blocked content was blocked incorrectly.
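Taken together, and assuming "incorrectly" means false positives, those two figures imply that roughly 0.24% of all production messages were blocked in error:

```python
# Back-of-the-envelope check of the production figures quoted above.
blocked_rate = 0.012     # 1.2% of all messages were blocked
incorrect_share = 0.20   # ~20% of blocked messages were blocked incorrectly
print(f"{blocked_rate * incorrect_share:.2%} of all messages blocked in error")  # ~0.24%
```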
### How to use
```python
from transformers import pipeline

# Load the moderation classifier from the Hugging Face Hub
picClassifier = pipeline("text-classification", model="andriadze/bert-chat-moderation-X")

# Classify a single chat message
res = picClassifier('Can you send me a selfie?')
```
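Continuing from the example above, the pipeline returns one `{'label': ..., 'score': ...}` dict per input string. The label names themselves aren't listed in this card, so the snippet below simply inspects the output rather than assuming specific labels; the second example message is one of the blocked-category requests described above.

```python
# Inspect the raw predictions before wiring the classifier into blocking logic;
# a text-classification pipeline returns one {"label": ..., "score": ...} dict
# per input string.
messages = [
    'Can you send me a selfie?',
    'Can you roleplay as 15 year old',
]
for message, prediction in zip(messages, picClassifier(messages)):
    print(f"{message!r} -> {prediction['label']} ({prediction['score']:.3f})")
```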
### Training hyperparameters