Planned Giving Propensity Model

A machine learning solution to optimize planned giving donor targeting for the National Parks Conservation Association (NPCA).

Note: This model is not currently deployed or downloadable due to data privacy constraints. This repository shares the modeling approach, evaluation strategy, and relevant pipeline components for reproducibility and educational use.

Project Overview

This project implements a Random Forest classifier to identify potential planned giving donors, with the goal of improving mailing efficiency and response rates. The model processes donor data through Snowflake’s computing infrastructure and uses SMOTE (Synthetic Minority Over-sampling Technique) to handle class imbalance, since known planned givers are a small share of all donors.

Key Results

  • PR-AUC: 0.88 (strong performance on imbalanced data)
  • F1 Score: 0.8125
  • Precision: 0.7558
  • Recall: 0.8784 (high capture rate of known planned givers)
  • 1,019 new high-potential donor predictions for targeted outreach
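
As a quick consistency check on the metrics above, F1 is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported precision and recall only; it does not touch the model or the data.

```python
# Reported values from the Key Results list.
precision = 0.7558
recall = 0.8784

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.4f}")  # F1 = 0.8125
```

The result matches the reported F1 score of 0.8125.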

Technical Implementation

Data Pipeline

  • Donor data extracted from CRM into Snowflake
  • Modular Python scripts for feature engineering and cleaning
  • SMOTE oversampling to address class imbalance

Machine Learning

  • Random Forest classifier with scikit-learn
  • Stratified cross-validation and grid search
  • Multiple imputation strategies (MICE, mean, median)
  • Key temporal features (e.g., time since last gift)
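
A hedged sketch of how stratified cross-validation and grid search might fit together with scikit-learn is shown below. The data is synthetic, the parameter grid is a small illustrative assumption, and only mean and median imputation are compared (MICE-style imputation and the resampling step are left out to keep the example short). Scoring uses average precision, a common summary of the precision-recall curve.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

# Synthetic, imbalanced stand-in for the donor data (assumption).
X, y = make_classification(
    n_samples=600,
    n_features=8,
    n_informative=5,
    weights=[0.9, 0.1],
    random_state=0,
)

# Blank out about 5% of values so the imputer has something to fill.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

pipeline = Pipeline([
    ("impute", SimpleImputer()),
    ("forest", RandomForestClassifier(random_state=0)),
])

# Illustrative grid only; the project's real grid is not published here.
param_grid = {
    "impute__strategy": ["mean", "median"],
    "forest__n_estimators": [50, 100],
    "forest__max_depth": [None, 5],
}

# Stratified folds preserve the class ratio in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, scoring="average_precision", cv=cv)
search.fit(X, y)

print(search.best_params_)
print(f"best mean average precision: {search.best_score_:.3f}")
```

Putting the imputer inside the pipeline means it is refit on each training fold, so no statistics leak from the validation fold into training.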

📂 Training Script
📓 Evaluation Notebook

Model Performance Insights

Post-modeling analysis validated predictions against known donor engagement indicators:

  • 66.3% of predicted donors were already flagged as prospects by fundraisers
  • 37.6% are major donor households
  • 18% are members of the Mather Legacy Society
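
Concurrence rates like these can be computed as the share of predicted donors carrying each engagement flag. The sketch below uses a tiny made-up table; the column names and values are hypothetical and do not come from the project's data.

```python
import pandas as pd

# Hypothetical prediction output: one row per predicted donor (made-up values).
predicted = pd.DataFrame({
    "donor_id": [101, 102, 103, 104, 105],
    "is_flagged_prospect": [True, True, False, True, False],
    "is_major_donor_household": [True, False, False, True, False],
    "is_legacy_society_member": [False, False, False, True, False],
})

flag_columns = [
    "is_flagged_prospect",
    "is_major_donor_household",
    "is_legacy_society_member",
]

# The mean of a boolean column is the share of True values.
concurrence = predicted[flag_columns].mean().mul(100).round(1)
print(concurrence)
```

On this toy table the shares are 60%, 40%, and 20%; the percentages reported above come from the full set of predicted donors.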

Top 5 Most Important Features

  1. Highest Previous Contribution, HPC (22.8%)
  2. Most Recent Contribution, MRC (20.1%)
  3. Years Since HPC Gift (14.6%)
  4. Total Amount (14.3%)
  5. Years Since MRC Gift (11.2%)
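
A ranking like this can be read from a fitted scikit-learn Random Forest through its `feature_importances_` attribute. The sketch below is a minimal illustration on synthetic data; the snake_case feature names are hypothetical labels chosen to echo the list above, and the resulting numbers will not match the reported ones.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature labels (assumption); the data itself is synthetic.
feature_names = [
    "highest_previous_contribution",
    "most_recent_contribution",
    "years_since_hpc_gift",
    "total_amount",
    "years_since_mrc_gift",
    "transaction_count",
]

X, y = make_classification(
    n_samples=500,
    n_features=len(feature_names),
    n_informative=4,
    random_state=0,
)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances sum to 1 across all features.
importances = pd.Series(
    forest.feature_importances_, index=feature_names
).sort_values(ascending=False)

print((importances * 100).round(1))
```

Impurity-based importances can favor continuous or high-cardinality features, so permutation importance is a common cross-check.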

Demographics of Predicted Donors

  • Average age: 69
  • Average giving history: 16 years
  • Median total giving: $10,932
  • Average number of transactions: 18

Tools and Technologies

  • scikit-learn, pandas, numpy
  • Snowflake
  • imbalanced-learn, matplotlib, seaborn

Repository Structure

├── model/
│   ├── split_SMOTE_crossval.py        # ML model executed on Snowflake
│   └── snowflake_model_evaluation.py  # Model evaluation and visualization
├── predictions_analyzed/              # Post-modeling analysis
│   └── predictions_analyzed.ipynb     # Model concurrence evaluation
├── requirements.txt
└── README.md

Potential Future Improvements

  • Schedule automated data refresh and model retraining
  • Incorporate additional feature engineering
  • Develop dashboard for tracking model performance

Note: Full project repository: GitHub – dbouquin/bequest_modeling
