Planned Giving Propensity Model

A machine learning solution to optimize planned giving donor targeting for the National Parks Conservation Association (NPCA).

Note: This model is not currently deployed or downloadable due to data privacy constraints. This repository shares the modeling approach, evaluation strategy, and relevant pipeline components for reproducibility and educational use.

Project Overview

This project implements a Random Forest classifier to identify potential planned giving donors, with the goal of improving mailing efficiency and response rates. The model processes donor data through Snowflake’s computing infrastructure and uses SMOTE (Synthetic Minority Over-sampling Technique) to handle class imbalance.

Key Results

PR-AUC: 0.88 — strong performance on imbalanced data
F1 Score: 0.8125
Precision: 0.7558
Recall: 0.8784 — high capture rate of known planned givers
1,019 new high-potential donor predictions for targeted outreach
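
The F1 score above is the harmonic mean of the reported precision and recall, so the three figures can be cross-checked. A minimal sketch (the function name is illustrative and not taken from the repository):

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid dividing by zero when both are zero
    return 2 * precision * recall / (precision + recall)


precision = 0.7558  # reported above
recall = 0.8784     # reported above
f1 = f1_from_precision_recall(precision, recall)
print(f"F1 = {f1:.4f}")  # prints "F1 = 0.8125"
```

The result matches the reported F1 score to four decimal places.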

Technical Implementation

Data Pipeline

Donor data extracted from CRM into Snowflake
Modular Python scripts for feature engineering and cleaning
SMOTE oversampling to address class imbalance
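
To illustrate what SMOTE does, the sketch below creates synthetic minority-class rows by interpolating between a minority row and one of its nearest minority neighbours. It is a from-scratch NumPy illustration on made-up data, not the project's code; in practice the `SMOTE` class from imbalanced-learn provides this. The function name `smote_sketch` and all parameters are hypothetical.

```python
import numpy as np


def smote_sketch(X_min: np.ndarray, n_new: int, k: int = 5, seed: int = 0) -> np.ndarray:
    """Create n_new synthetic rows from minority-class rows X_min.

    Each synthetic row lies on the segment between a randomly chosen
    minority row and one of its k nearest minority neighbours.
    """
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = np.random.default_rng(seed)
    k = min(k, n - 1)  # cannot have more neighbours than other rows

    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a row is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)   # starting row for each synthetic row
    pick = rng.integers(0, k, size=n_new)   # which of its neighbours to use
    partner = neighbours[base, pick]
    gap = rng.random((n_new, 1))            # position along the segment, in [0, 1)
    return X_min[base] + gap * (X_min[partner] - X_min[base])


# Made-up data: 200 majority rows and 20 minority rows with 3 features.
rng = np.random.default_rng(42)
X_majority = rng.normal(0.0, 1.0, size=(200, 3))
X_minority = rng.normal(2.0, 1.0, size=(20, 3))

X_synth = smote_sketch(X_minority, n_new=len(X_majority) - len(X_minority))
X_minority_balanced = np.vstack([X_minority, X_synth])
print(X_minority_balanced.shape)  # (200, 3)
```

A common precaution is to oversample only the training split, so that synthetic rows never appear in the data used for evaluation.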

Machine Learning

Random Forest classifier with scikit-learn
Stratified cross-validation and grid search
Several imputation strategies (MICE, mean, median)
Key temporal features (e.g., time since last gift)
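
The sketch below shows one way the pieces listed above could fit together in scikit-learn: an imputation step and a Random Forest in a single pipeline, tuned by grid search over stratified folds. It runs on synthetic data, and the parameter grid, fold count, and scoring choice are assumptions for illustration rather than the project's actual settings. Only mean and median imputation are shown; scikit-learn's experimental `IterativeImputer` would be the closest counterpart to MICE.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

# Synthetic, imbalanced stand-in data (roughly 10% positives); not donor data.
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4,
    weights=[0.9, 0.1], random_state=0,
)

# Blank out about 5% of values so the imputation step has work to do.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Keeping imputation inside the pipeline means it is refit on each
# training fold, so validation folds do not influence the fill values.
pipeline = Pipeline([
    ("impute", SimpleImputer()),
    ("forest", RandomForestClassifier(random_state=0)),
])

# Hypothetical grid, kept small so the example runs quickly.
param_grid = {
    "impute__strategy": ["mean", "median"],
    "forest__n_estimators": [25, 50],
    "forest__max_depth": [3, None],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    scoring="average_precision",  # a summary of the precision-recall curve
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Stratified folds keep the share of positives similar in every fold, which matters when the positive class is rare.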

📂 Training Script
📓 Evaluation Notebook

Model Performance Insights

Post-modeling analysis validated predictions against known donor engagement indicators:

66.3% of predicted donors were already flagged as prospects by fundraisers
37.6% are major donor households
18% are members of the Mather Legacy Society
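
Percentages like these can be computed as the share of predicted donors who also appear in each reference group. A minimal sketch with made-up donor IDs (the function and variable names are hypothetical, and the numbers are unrelated to the results above):

```python
def share_in_group(predicted_ids: set, group_ids: set) -> float:
    """Return the percentage of predicted donors who are also in group_ids."""
    if not predicted_ids:
        return 0.0
    return 100 * len(predicted_ids & group_ids) / len(predicted_ids)


# Made-up IDs, for illustration only.
predicted = {101, 102, 103, 104, 105, 106, 107, 108}
flagged_prospects = {101, 102, 103, 104, 105, 999}
major_donor_households = {102, 105, 108}

print(share_in_group(predicted, flagged_prospects))       # 62.5
print(share_in_group(predicted, major_donor_households))  # 37.5
```

The denominator is always the predicted set, so the groups may overlap and the percentages need not sum to 100.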

Top 5 Most Important Features

Highest Previous Contribution, HPC (22.8%)
Most Recent Contribution, MRC (20.1%)
Years Since HPC Gift (14.6%)
Total Amount (14.3%)
Years Since MRC Gift (11.2%)
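
Features of this kind can be derived from a per-gift transaction table. The sketch below assumes that HPC and MRC stand for Highest Previous Contribution and Most Recent Contribution; the column names, the reference date, and the sample rows are all hypothetical.

```python
import pandas as pd

# Made-up gift history; column names are assumptions.
gifts = pd.DataFrame({
    "donor_id": [1, 1, 1, 2, 2],
    "gift_date": pd.to_datetime(
        ["2010-06-01", "2018-03-15", "2022-11-30", "2015-01-10", "2019-07-04"]
    ),
    "amount": [50.0, 500.0, 100.0, 250.0, 75.0],
})
as_of = pd.Timestamp("2024-01-01")  # assumed reference date for "years since"


def donor_features(group: pd.DataFrame) -> pd.Series:
    """Summarize one donor's gifts into model-ready features."""
    largest = group.loc[group["amount"].idxmax()]    # row of the largest gift
    latest = group.loc[group["gift_date"].idxmax()]  # row of the most recent gift
    return pd.Series({
        "hpc_amount": largest["amount"],
        "mrc_amount": latest["amount"],
        "years_since_hpc": (as_of - largest["gift_date"]).days / 365.25,
        "years_since_mrc": (as_of - latest["gift_date"]).days / 365.25,
        "total_amount": group["amount"].sum(),
    })


# One row of features per donor, indexed by donor_id.
features = gifts.groupby("donor_id")[["gift_date", "amount"]].apply(donor_features)
print(features.round(2))
```

Fixing the reference date makes the features reproducible; using the current date instead would shift every "years since" value each time the pipeline runs.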

Demographics of Predicted Donors

Average age: 69
Average giving history: 16 years
Median total giving: $10,932
Average number of transactions: 18

Tools and Technologies

scikit-learn, pandas, numpy
Snowflake
imbalanced-learn, matplotlib, seaborn

Repository Structure

├── model/                      
│   ├── split_SMOTE_crossval.py        # ML model executed on Snowflake
│   └── snowflake_model_evaluation.py  # Model evaluation and visualization
├── predictions_analyzed/              # Post-modeling analysis
│   └── predictions_analyzed.ipynb     # Model concurrence evaluation
├── requirements.txt
└── README.md

Potential Future Improvements

Schedule automated data refresh and model retraining
Incorporate additional feature engineering
Develop dashboard for tracking model performance

Note: The full project repository is on GitHub at dbouquin/bequest_modeling.