---
license: mit
library_name: sklearn
tags:
- sklearn
- skops
- text-classification
model_format: pickle
model_file: model.pkl
---
# Model description
Text classification model for suicide detection.
Python 3.10 only.
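
A minimal loading sketch, assuming `model.pkl` has been downloaded to the working directory. The helper name `load_model` is hypothetical. Note that pickle stores functions by reference, so the `preprocessor` function used by the pipeline has to be importable under the same module and name it had when the model was saved; otherwise unpickling fails.

```python
import pickle
import sys
import warnings
from pathlib import Path


def load_model(path="model.pkl"):
    """Unpickle the saved pipeline, warning if the interpreter is not Python 3.10."""
    if sys.version_info[:2] != (3, 10):
        warnings.warn(
            "This model card states Python 3.10 only; "
            f"running {sys.version_info.major}.{sys.version_info.minor}."
        )
    # pickle.load can execute arbitrary code, so only load files you trust.
    with Path(path).open("rb") as fh:
        return pickle.load(fh)
```

Once loaded, the pipeline should accept raw strings directly, for example `load_model().predict(["some text"])`, since vectorisation is part of the pipeline. The label encoding returned by `predict` is not documented on this card.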
## Training Procedure
Trained on 70% of the Suicide and Depression Detection dataset (https://www.kaggle.com/datasets/nikhileswarkomati/suicide-watch).
The model vectorises each text with a fitted TF-IDF vectorizer and then classifies it with XGBoost.
See main.py for further details.
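
The procedure above can be sketched roughly as follows. This is not the contents of `main.py`: the body of `preprocessor`, the helpers `build_pipeline` and `train`, the `random_state`, the stratified split, and the assumption that labels are already encoded as integers 0/1 are all illustrative. Only `min_df=100`, `ngram_range=(1, 3)`, the step names, `verbose=True`, the untuned XGBoost defaults, and the 0.7 training fraction come from this card.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline


def preprocessor(text):
    # Hypothetical stand-in: the real preprocessor is defined in main.py.
    return text.lower().strip()


def build_pipeline(min_df=100, classifier=None):
    """TF-IDF over word 1- to 3-grams followed by a classifier (XGBoost by default)."""
    if classifier is None:
        # Imported lazily so the vectoriser settings can be tried without xgboost.
        from xgboost import XGBClassifier

        classifier = XGBClassifier()  # no hyperparameters overridden
    return Pipeline(
        steps=[
            ("tfidf", TfidfVectorizer(min_df=min_df, ngram_range=(1, 3), preprocessor=preprocessor)),
            ("classifier", classifier),
        ],
        verbose=True,
    )


def train(texts, labels, **kwargs):
    """Fit on 70% of the data and return the pipeline plus held-out accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, train_size=0.7, random_state=0, stratify=labels
    )
    pipeline = build_pipeline(**kwargs).fit(X_train, y_train)
    return pipeline, pipeline.score(X_test, y_test)
```

With `min_df=100`, an n-gram is kept only if it appears in at least 100 training documents, so a small toy corpus needs a lower value (for example `train(texts, labels, min_df=1)`).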
### Hyperparameters
| Hyperparameter | Value |
|----------------|-------|
| memory | |
| steps | `[('tfidf', TfidfVectorizer(min_df=100, ngram_range=(1, 3), preprocessor=<function preprocessor at 0x7f8d443a30a0>)), ('classifier', XGBClassifier(base_score=None, booster=None, callbacks=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, device=None, early_stopping_rounds=None, enable_categorical=False, eval_metric=None, feature_types=None, gamma=None, grow_policy=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_bin=None, max_cat_threshold=None, max_cat_to_onehot=None, max_delta_step=None, max_depth=None, max_leaves=None, min_child_weight=None, missing=nan, monotone_constraints=None, multi_strategy=None, n_estimators=None, n_jobs=None, num_parallel_tree=None, random_state=None, ...))]` |
| verbose | True |
| tfidf | `TfidfVectorizer(min_df=100, ngram_range=(1, 3), preprocessor=<function preprocessor at 0x7f8d443a30a0>)` |
| classifier | `XGBClassifier(base_score=None, booster=None, callbacks=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, device=None, early_stopping_rounds=None, enable_categorical=False, eval_metric=None, feature_types=None, gamma=None, grow_policy=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_bin=None, max_cat_threshold=None, max_cat_to_onehot=None, max_delta_step=None, max_depth=None, max_leaves=None, min_child_weight=None, missing=nan, monotone_constraints=None, multi_strategy=None, n_estimators=None, n_jobs=None, num_parallel_tree=None, random_state=None, ...)` |
| tfidf__analyzer | word |
| tfidf__binary | False |
| tfidf__decode_error | strict |
| tfidf__dtype |

`Pipeline(steps=[('tfidf', TfidfVectorizer(min_df=100, ngram_range=(1, 3), preprocessor=<function preprocessor at 0x7f8d443a30a0>)), ('classifier', XGBClassifier(base_score=None, booster=None, callbacks=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=None, device=None, early_stopping_rounds=None, enable_categorical=False, eval_metric=None, feature_types=None, gamma=None, grow_policy=None, importance_type=None, interaction_constraints=None, learning_rate=None, max_bin=None, max_cat_threshold=None, max_cat_to_onehot=None, max_delta_step=None, max_depth=None, max_leaves=None, min_child_weight=None, missing=nan, monotone_constraints=None, multi_strategy=None, n_estimators=None, n_jobs=None, num_parallel_tree=None, random_state=None, ...))], verbose=True)`
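
The double-underscore names in the hyperparameter table (for example `tfidf__analyzer`) follow scikit-learn's nested-parameter convention: `Pipeline.get_params()` reports each step's parameters as `<step name>__<parameter>`. A small sketch of how such names arise, using `LogisticRegression` purely as a stand-in classifier (an assumption, so the snippet runs without xgboost) and omitting the custom `preprocessor`:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

pipeline = Pipeline(
    steps=[
        ("tfidf", TfidfVectorizer(min_df=100, ngram_range=(1, 3))),
        ("classifier", LogisticRegression()),  # stand-in, not the card's XGBClassifier
    ],
    verbose=True,
)

params = pipeline.get_params(deep=True)

# Top-level Pipeline arguments appear under their own names...
print("verbose =", params["verbose"])
# ...while each step's arguments are prefixed with "<step name>__".
for name in ("tfidf__analyzer", "tfidf__binary", "tfidf__decode_error", "tfidf__min_df"):
    print(name, "=", params[name])
```

The same prefixed names can be passed to `pipeline.set_params(...)` to change a nested setting without rebuilding the pipeline.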