---
library_name: transformers
tags:
- big-five
- regression
- psychology
- transformer
- text-analysis
license: mit
datasets:
- jingjietan/essays-big5
language:
- en
---

# 🧠 Big Five Personality Regression Model

This model predicts the Big Five personality traits (Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism) from English free text. It outputs five continuous values between 0.0 and 1.0, one per trait.

---

## Model Details

### Model Description

- **Developed by:** [vladinc](https://huggingface.co/vladinc)
- **Model type:** DistilBERT encoder with a five-output regression head
- **Language(s):** English
- **License:** MIT
- **Finetuned from model:** `distilbert-base-uncased`
- **Trained on:** ~8,700 essays from the `jingjietan/essays-big5` dataset

### Model Sources

- **Repository:** [https://huggingface.co/vladinc/bigfive-regression-model](https://huggingface.co/vladinc/bigfive-regression-model)

---

## Uses

### Direct Use

This model can be used to estimate personality profiles from user-written text. It may be useful in psychological analysis, conversational profiling, or educational feedback systems.

### Out-of-Scope Use

- Not intended for clinical or diagnostic use.
- Should not be used to make hiring, legal, or psychological decisions.
- Not validated across cultures or demographic groups.

---

## Bias, Risks, and Limitations

- Trained on essay data; generalizability to tweets, messages, or other short-form text may be limited.
- Extraversion (0.680) and Neuroticism (0.564) had the highest validation MSE, so predictions for these traits are less reliable.
- Cultural and linguistic biases in the training data may influence predictions.

### Recommendations

Do not use predictions from this model in isolation. Supplement them with human judgment and/or other assessment tools.

---

## How to Get Started with the Model

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "vladinc/bigfive-regression-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "I enjoy reflecting on abstract concepts and trying new things."
# Truncate long essays to the model's maximum input length
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)

# Tensor of shape (1, 5): one score per trait, expected to lie between 0.0 and 1.0
print(outputs.logits)
```

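
The five values in `outputs.logits` carry no names, so a small helper can attach them. The sketch below is a minimal, hypothetical example rather than part of the model's API: the `TRAITS` order (the order in which this card lists the traits) is an assumption that the card does not confirm, and the optional clipping assumes the standard sequence-classification head, whose final linear layer does not itself bound values to the 0.0 to 1.0 range. With the snippet above, it could be called as `to_profile(outputs.logits[0].tolist())`.

```python
from typing import Dict, Sequence

# ASSUMPTION: output order follows the order in which the card lists the traits.
# Verify it against the label columns used during fine-tuning.
TRAITS = (
    "Openness",
    "Conscientiousness",
    "Extraversion",
    "Agreeableness",
    "Neuroticism",
)


def to_profile(scores: Sequence[float], clip: bool = True) -> Dict[str, float]:
    """Pair five raw regression outputs with trait names (hypothetical helper)."""
    if len(scores) != len(TRAITS):
        raise ValueError(f"expected {len(TRAITS)} scores, got {len(scores)}")
    values = [float(s) for s in scores]
    if clip:
        # A linear output layer can produce values slightly outside [0, 1],
        # so clamp them to the documented range.
        values = [min(1.0, max(0.0, v)) for v in values]
    return dict(zip(TRAITS, values))


# Made-up numbers for illustration only; these are not real model outputs.
print(to_profile([0.71, 0.48, 1.03, 0.55, -0.02]))
```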
---

## Training Details

### Training Data

- **Dataset:** `jingjietan/essays-big5`
- **Format:** Essay text + 5 numeric labels for personality traits

### Training Procedure

- **Epochs:** 3
- **Batch size:** 8
- **Learning rate:** 2e-5
- **Loss function:** Mean squared error (MSE)
- **Metric for best model:** MSE on Openness

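
The procedure above selects the best checkpoint by MSE on a single trait. One way to expose such a metric is a `compute_metrics` function of the kind the `transformers` `Trainer` accepts. The following is a minimal sketch under stated assumptions, not the author's training code: the metric key names (such as `mse_openness`) and the trait order are hypothetical, and with `Trainer` one would additionally set `metric_for_best_model="mse_openness"` and `greater_is_better=False`.

```python
import numpy as np

# ASSUMPTION: label/output column order; the model card does not state it.
TRAITS = [
    "openness",
    "conscientiousness",
    "extraversion",
    "agreeableness",
    "neuroticism",
]


def compute_metrics(eval_pred):
    """Return per-trait and mean MSE for (predictions, labels) of shape (N, 5)."""
    predictions, labels = eval_pred
    predictions = np.asarray(predictions, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)
    # Average squared error over examples, keeping one value per trait.
    per_trait = ((predictions - labels) ** 2).mean(axis=0)
    metrics = {f"mse_{name}": float(v) for name, v in zip(TRAITS, per_trait)}
    metrics["mse"] = float(per_trait.mean())
    return metrics


# Tiny synthetic check with made-up numbers (not model outputs).
preds = np.array([[0.5, 0.5, 0.5, 0.5, 0.5],
                  [1.0, 0.0, 0.5, 0.5, 0.5]])
labels = np.array([[1.0, 0.0, 1.0, 0.0, 1.0],
                   [1.0, 0.0, 0.0, 1.0, 0.0]])
print(compute_metrics((preds, labels)))
```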
---

## Evaluation

### Metrics

| Trait | Validation MSE |
|---|---|
| Openness | 0.324 |
| Conscientiousness | 0.537 |
| Extraversion | 0.680 |
| Agreeableness | 0.441 |
| Neuroticism | 0.564 |

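
Because MSE is expressed in squared units, taking its square root gives an RMSE on the same scale as the predicted scores, which can be easier to interpret. The short calculation below only restates the validation figures reported in this section; it adds no new measurements.

```python
import math

# Validation MSE values copied from the Metrics figures in this section.
validation_mse = {
    "Openness": 0.324,
    "Conscientiousness": 0.537,
    "Extraversion": 0.680,
    "Agreeableness": 0.441,
    "Neuroticism": 0.564,
}

# RMSE = sqrt(MSE), expressed on the same scale as the trait scores.
rmse = {trait: math.sqrt(mse) for trait, mse in validation_mse.items()}
mean_mse = sum(validation_mse.values()) / len(validation_mse)

for trait, value in rmse.items():
    print(f"{trait:<18} RMSE = {value:.3f}")
print(f"Mean MSE across traits = {mean_mse:.3f}")
```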
---

## Citation

If you use this model, please cite it:

**BibTeX:**

```bibtex
@misc{vladinc2025bigfive,
  title={Big Five Personality Regression Model},
  author={vladinc},
  year={2025},
  howpublished={\url{https://huggingface.co/vladinc/bigfive-regression-model}}
}
```

---

## Contact

If you have questions or suggestions, feel free to reach out via the [vladinc](https://huggingface.co/vladinc) Hugging Face profile.