maxsonderby committed · Commit 8a78029 · verified · 1 Parent(s): 9a8de1d

Upload README.md with huggingface_hub
Files changed (1): README.md (+60 -3)
README.md CHANGED
@@ -1,3 +1,60 @@
- ---
- license: llama3.1
- ---
+ # VISION-1: Content Safety Analysis Model
+
+ VISION-1 is a fine-tuned version of Llama 3.1 8B Instruct, specialized for content safety analysis and moderation. The model is trained to identify and analyze potential safety concerns in text, including scams, fraud, harmful content, and other inappropriate material.
+
+ ## Model Details
+
+ - **Base Model**: Llama 3.1 8B Instruct
+ - **Training Data**: Specialized safety and content moderation dataset
+ - **Model Type**: Decoder-only transformer
+ - **Parameters**: 8 billion
+ - **Training Infrastructure**: 2x NVIDIA H100 SXM GPUs
+ - **License**: Llama 3.1 Community License (same as the base model)
+
+ ## Usage
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ # Load the tokenizer (shared with the base model) and the fine-tuned weights
+ tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
+ model = AutoModelForCausalLM.from_pretrained(
+     "OverseerAI/VISION-1",
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+
+ # Format the prompt with the Llama 3.1 chat template
+ prompt = "Analyze the following content for safety concerns: 'Click here to win a free iPhone! Just enter your credit card details.'"
+ messages = [{"role": "user", "content": prompt}]
+ inputs = tokenizer.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+
+ # Generate a response and decode only the newly generated tokens
+ outputs = model.generate(inputs, max_new_tokens=128)
+ response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
+ print(response)
+ ```
+
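+ The tokenizer's built-in chat template produces the special-token format that Llama 3.1 Instruct expects; the `torch_dtype` and `device_map` arguments are optional conveniences that assume a GPU and an `accelerate` install.
+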
+ ## Training Details
+
+ - **Training Type**: Fine-tuning
+ - **Framework**: PyTorch with DeepSpeed (see the sketch below)
+ - **Training Data**: Specialized dataset focused on content safety
+ - **Hardware**: 2x NVIDIA H100 SXM GPUs
+ - **Training Length**: ~4 epochs
+
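+ The exact training setup is not published. As a rough sketch of what a run like this could look like with the Hugging Face `Trainer` and DeepSpeed, assuming hypothetical hyperparameters, a stand-in dataset, and a ZeRO config file named `ds_config.json` (none of these are the actual settings):
+
+ ```python
+ from datasets import Dataset
+ from transformers import (
+     AutoModelForCausalLM,
+     AutoTokenizer,
+     Trainer,
+     TrainingArguments,
+ )
+
+ tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
+ model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
+
+ # Stand-in for the (unreleased) safety dataset: causal-LM labels = input_ids
+ def tokenize(batch):
+     out = tokenizer(batch["text"], truncation=True, max_length=512)
+     out["labels"] = [ids.copy() for ids in out["input_ids"]]
+     return out
+
+ train_dataset = Dataset.from_dict(
+     {"text": ["Example safety-analysis training text."]}
+ ).map(tokenize, batched=True, remove_columns=["text"])
+
+ args = TrainingArguments(
+     output_dir="vision-1",
+     num_train_epochs=4,             # matches the ~4 epochs noted above
+     per_device_train_batch_size=4,  # assumed; tune for 2x H100 80GB
+     gradient_accumulation_steps=8,  # assumed
+     bf16=True,
+     deepspeed="ds_config.json",     # hypothetical ZeRO config
+ )
+
+ Trainer(model=model, args=args, train_dataset=train_dataset).train()
+ ```
+
+ A two-GPU run would typically be launched with the DeepSpeed launcher, e.g. `deepspeed --num_gpus=2 train.py`.
+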
+ ## Intended Use
+
+ - Content moderation (see the helper sketch below)
+ - Safety analysis
+ - Fraud detection
+ - Harmful content identification
+
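+ As an illustration of how these use cases might be wired up, here is a minimal helper built on the Usage snippet above (the `analyze_content` name and its settings are hypothetical, not part of the model's API):
+
+ ```python
+ def analyze_content(text: str, max_new_tokens: int = 128) -> str:
+     """Ask VISION-1 for a safety analysis of `text` (hypothetical helper)."""
+     messages = [{
+         "role": "user",
+         "content": f"Analyze the following content for safety concerns: '{text}'",
+     }]
+     inputs = tokenizer.apply_chat_template(
+         messages, add_generation_prompt=True, return_tensors="pt"
+     ).to(model.device)
+     outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
+     return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
+
+ # Screen a batch of user submissions
+ for text in ["Win a FREE iPhone, just send your card number!", "Meeting moved to 3pm."]:
+     print(analyze_content(text))
+ ```
+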
+ ## Limitations
+
+ - Model outputs should be treated as suggestions, not definitive judgments
+ - May reflect biases present in its training data
+ - Should be used as one part of a broader content moderation strategy
+ - Performance may vary with content type and context
+
+ ## Ethical Considerations
+
+ - Use the model responsibly for content moderation
+ - Keep human oversight in place for critical decisions
+ - Consider privacy implications when analyzing user content
+ - Evaluate model outputs regularly for bias