Planned Giving Propensity Model
A machine learning solution to optimize planned giving donor targeting for the National Parks Conservation Association (NPCA).
Note: This model is not currently deployed or downloadable due to data privacy constraints. This repository shares the modeling approach, evaluation strategy, and relevant pipeline components for reproducibility and educational use.
Project Overview
This project implements a Random Forest classifier to identify potential planned giving donors, with the goal of improving mailing efficiency and response rates. The model processes donor data through Snowflake's computing infrastructure and uses SMOTE to handle class imbalance.
Key Results
- PR-AUC: 0.88 (strong performance on imbalanced data)
- F1 Score: 0.8125
- Precision: 0.7558
- Recall: 0.8784 (high capture rate of known planned givers)
- 1,019 new high-potential donor predictions for targeted outreach
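As a quick consistency check on the metrics above, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the reported values. The function name is illustrative and is not taken from the project's code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in Key Results.
precision, recall = 0.7558, 0.8784
f1 = f1_from_precision_recall(precision, recall)
print(round(f1, 4))  # 0.8125
```

The result matches the reported F1 score to four decimal places.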
Technical Implementation
Data Pipeline
- Donor data extracted from CRM into Snowflake
- Modular Python scripts for feature engineering and cleaning
- SMOTE oversampling to address class imbalance
Machine Learning
- Random Forest classifier with scikit-learn
- Stratified cross-validation and grid search
- Multiple imputation strategies (MICE, mean, median)
- Key temporal features (e.g., time since last gift)
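The sketch below shows one way stratified cross-validation, grid search, and an imputation choice might be combined in scikit-learn. The data is synthetic, and the parameter grid, fold count, and missing-value rate are assumptions rather than the project's settings. SMOTE is left out to keep the example short, and only the mean and median imputers are compared (scikit-learn's MICE-inspired `IterativeImputer` is omitted).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

# Synthetic, imbalanced stand-in for the donor table.
X, y = make_classification(
    n_samples=600,
    n_features=8,
    n_informative=5,
    weights=[0.9, 0.1],
    random_state=0,
)

# Blank out about 5% of values so the imputer has something to fill.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

pipe = Pipeline([
    ("impute", SimpleImputer()),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
])

# Illustrative grid: imputation strategy is tuned alongside tree depth.
param_grid = {
    "impute__strategy": ["mean", "median"],
    "rf__max_depth": [4, None],
}

# Stratified folds preserve the class ratio in every split.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

# "average_precision" summarizes the precision-recall curve.
search = GridSearchCV(pipe, param_grid, scoring="average_precision", cv=cv)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Placing the imputer inside the pipeline means it is refit on each training fold, which avoids leaking statistics from the validation fold.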
Training Script
Evaluation Notebook
Model Performance Insights
Post-modeling analysis validated predictions against known donor engagement indicators:
- 66.3% of predicted donors were already flagged as prospects by fundraisers
- 37.6% are major donor households
- 18% are members of the Mather Legacy Society
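Overlap percentages like those above can be computed as the share of predicted donors carrying each engagement flag. The sketch below uses a five-row toy table. The column names and values are hypothetical, since the actual CRM field names are not given here.

```python
import pandas as pd

# Hypothetical flags for five predicted donors (toy data, not real results).
predicted = pd.DataFrame({
    "donor_id": [1, 2, 3, 4, 5],
    "is_flagged_prospect": [True, True, False, True, False],
    "is_major_donor_household": [True, False, False, True, False],
    "is_legacy_society_member": [False, False, False, True, False],
})

flag_cols = [c for c in predicted.columns if c.startswith("is_")]

# The mean of a boolean column is the fraction of True values.
overlap_pct = (predicted[flag_cols].mean() * 100).round(1)
print(overlap_pct)
```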
Top 5 Most Important Features
- Highest Previous Contribution (22.8%)
- Most Recent Contribution (20.1%)
- Years Since HPC Gift (14.6%)
- Total Amount (14.3%)
- Years Since MRC Gift (11.2%)
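A ranking like the one above can be read from a fitted Random Forest's `feature_importances_` attribute, which in scikit-learn holds impurity-based importances that sum to 1. The sketch below assumes that is the importance measure being reported. The feature names are placeholders echoing the list, and the data is synthetic, so the printed values will not match the figures above.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder names echoing the list above; the data itself is synthetic.
feature_names = [
    "highest_previous_contribution",
    "most_recent_contribution",
    "years_since_hpc_gift",
    "total_amount",
    "years_since_mrc_gift",
    "num_transactions",
]

X, y = make_classification(
    n_samples=500,
    n_features=len(feature_names),
    n_informative=4,
    random_state=0,
)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Pair each importance with its feature name and sort, largest first.
importances = pd.Series(
    model.feature_importances_, index=feature_names
).sort_values(ascending=False)

print((importances * 100).round(1).head(5))
```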
Demographics of Predicted Donors
- Average age: 69
- Average giving history: 16 years
- Median total giving: $10,932
- Average number of transactions: 18
Tools and Technologies
- scikit-learn, pandas, numpy
- Snowflake
- imbalanced-learn, matplotlib, seaborn
Repository Structure
├── model/
│   ├── split_SMOTE_crossval.py          # ML model executed on Snowflake
│   └── snowflake_model_evaluation.py    # Model evaluation and visualization
├── predictions_analyzed/                # Post-modeling analysis
│   └── predictions_analyzed.ipynb       # Model concurrence evaluation
├── requirements.txt
└── README.md
Potential Future Improvements
- Schedule automated data refresh and model retraining
- Incorporate additional feature engineering
- Develop dashboard for tracking model performance
Note: Full project repository on GitHub: dbouquin/bequest_modeling