---
title: README
emoji: 🐨
colorFrom: blue
colorTo: red
sdk: static
pinned: false
---
# 🔍 OverseerAI

## Mission
OverseerAI is dedicated to advancing open-source AI safety and content moderation tools. We develop state-of-the-art models and datasets for brand safety classification, making content moderation more accessible and efficient for developers and organizations.

## 🌟 Our Projects

### Datasets
#### [BrandSafe-16k](https://huggingface.co/datasets/OverseerAI/BrandSafe-16k)
A comprehensive dataset for training brand safety classification models, featuring 16 distinct risk categories:

| Category | Description |
|----------|-------------|
| B1-PROFANITY | Explicit language and cursing |
| B2-OFFENSIVE_SLANG | Informal offensive terms |
| B3-COMPETITOR | Competitive brand mentions |
| B4-BRAND_CRITICISM | Negative brand commentary |
| B5-MISLEADING | Deceptive or false information |
| B6-POLITICAL | Political content and discussions |
| B7-RELIGIOUS | Religious themes and references |
| B8-CONTROVERSIAL | Contentious topics |
| B9-ADULT | Adult or mature content |
| B10-VIOLENCE | Violent themes or descriptions |
| B11-SUBSTANCE | Drug and alcohol references |
| B12-HATE | Hate speech and discrimination |
| B13-STEREOTYPE | Stereotypical content |
| B14-BIAS | Biased viewpoints |
| B15-UNPROFESSIONAL | Unprofessional content |
| B16-MANIPULATION | Manipulative content |
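
When working with these labels in code, it can help to keep the taxonomy in one place. The following is a minimal sketch, not part of the dataset tooling: the codes and descriptions are copied from the table above, while `BRAND_SAFETY_CATEGORIES` and `normalize_label` are hypothetical names introduced here for illustration. The dataset's actual column names and label format are not stated in this README, so check them before relying on this.

```python
import re
from typing import Optional

# Category codes and descriptions, copied from the table above.
BRAND_SAFETY_CATEGORIES = {
    "B1-PROFANITY": "Explicit language and cursing",
    "B2-OFFENSIVE_SLANG": "Informal offensive terms",
    "B3-COMPETITOR": "Competitive brand mentions",
    "B4-BRAND_CRITICISM": "Negative brand commentary",
    "B5-MISLEADING": "Deceptive or false information",
    "B6-POLITICAL": "Political content and discussions",
    "B7-RELIGIOUS": "Religious themes and references",
    "B8-CONTROVERSIAL": "Contentious topics",
    "B9-ADULT": "Adult or mature content",
    "B10-VIOLENCE": "Violent themes or descriptions",
    "B11-SUBSTANCE": "Drug and alcohol references",
    "B12-HATE": "Hate speech and discrimination",
    "B13-STEREOTYPE": "Stereotypical content",
    "B14-BIAS": "Biased viewpoints",
    "B15-UNPROFESSIONAL": "Unprofessional content",
    "B16-MANIPULATION": "Manipulative content",
}

# Index the full labels by their numeric part, e.g. 12 -> "B12-HATE".
_BY_NUMBER = {
    int(label.split("-", 1)[0][1:]): label for label in BRAND_SAFETY_CATEGORIES
}


def normalize_label(raw: str) -> Optional[str]:
    """Map a loosely formatted label (hypothetical input) to a canonical one.

    Accepts either a full label in any letter case ("b12-hate") or a bare
    code ("B12"). Returns None for anything not in the table.
    """
    text = raw.strip().upper()
    if text in BRAND_SAFETY_CATEGORIES:
        return text
    match = re.fullmatch(r"B(\d{1,2})", text)
    if match:
        return _BY_NUMBER.get(int(match.group(1)))
    return None
```

For example, `normalize_label(" B3 ")` resolves to `"B3-COMPETITOR"`, and an unknown code such as `"B17"` yields `None`, which makes it easy to reject unexpected model output.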

### Models

#### [vision-1](https://huggingface.co/OverseerAI/vision-1)
Our flagship model for brand safety classification:
- Architecture: Meta Llama 3.1 (15GB)
- Full precision model optimized for high accuracy
- Trained on BrandSafe-16k dataset
- Ideal for production deployments with high-end GPU resources

#### [vision-1-mini](https://huggingface.co/OverseerAI/vision-1-mini)
A lightweight, optimized version of vision-1:
- Size: 4.58 GiB
- Architecture: Llama 3.1 8B
- Quantization: GGUF V3 (Q4_K)
- Optimized for Apple Silicon
- Fast load time: 3.27s
- Efficient memory usage: 4552.80 MiB CPU / 132.50 MiB Metal
- Perfect for local deployment and smaller compute resources
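
As a quick sanity check on the figures above, the CPU and Metal buffer sizes appear to add up to roughly the listed model size. This is an observation from the numbers in this README, not a statement from the model card, and the constant names below are made up for the example.

```python
# Figures quoted in the list above (MiB).
CPU_BUFFER_MIB = 4552.80
METAL_BUFFER_MIB = 132.50

# Sum the two buffers and convert MiB to GiB (1 GiB = 1024 MiB).
total_mib = CPU_BUFFER_MIB + METAL_BUFFER_MIB
total_gib = total_mib / 1024

print(f"{total_mib:.2f} MiB = {total_gib:.2f} GiB")
# prints "4685.30 MiB = 4.58 GiB"
```

Actual memory use at inference time will likely be higher than this, since it does not account for the context cache or runtime overhead.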

## 💡 Use Cases
- Content moderation for social media platforms
- Brand safety monitoring for advertising
- User-generated content filtering
- Real-time content classification
- Safe content recommendation systems

## 🤝 Contributing
We welcome contributions from the community, including:
- Improving model accuracy
- Expanding the dataset
- Optimizing for different hardware
- Adding new classification categories
- Reporting issues or suggesting improvements

## 📫 Contact
- GitHub: [OverseerAI](https://github.com/OverseerAI)
- HuggingFace: [OverseerAI](https://huggingface.co/OverseerAI)

## 📜 License
Our models are released under the Llama 3.1 license, and our datasets are available under open-source licenses to promote accessibility and innovation in AI safety.

---
*OverseerAI - Making AI Safety Accessible and Efficient*