license: other
language:
  - en
tags:
  - nba
  - basketball

NBA Predictions

This repo contains AI model code and weights that predict the outcome of NBA games. The model's output is the probability of each point spread occurring.

Intro

The model takes 8 players from each of the home and away teams, plus their ages, as input. It then outputs a probability for each point spread between -20 and +20 points, from the home team's point of view.

For example, the following text and charts represent the model's opinion on the Boston Celtics vs the Denver Nuggets, a matchup I am personally terrified of as a Celtics fan.

Let's start with both teams at pretty much full strength, with the Celtics at home. In this example, the model predicts the Celtics to win around 3 in every 4 games, with a 14% chance of the Celtics winning by 20 or more.

Full strength Celtics vs full strength Denver. Celtics at home.

Let's flip the location and see what the model thinks would happen if the Celtics had to travel to Denver. Interestingly, the model now favors Denver to win with 55% confidence.

Full strength Celtics vs full strength Denver. Denver at home.

Now here's the really fun part: mixing and matching players. Most people would say Jokic is the best player in the league at the time of writing, and Tatum is a notch below him. A lot of people would also say that the Celtics are an incredibly deep team, as far as their starters are concerned, while the Nuggets are a bit more reliant on their top stars.

All of this is to say that taking Jokic off the Nuggets should have more of an effect than taking Tatum off the Celtics. The chart below shows Denver at home, without Jokic in the lineup. He has been replaced by Peyton Watson. As you can see, Denver's win percentage dropped by 13%.

Full strength Celtics vs Denver without Jokic. Denver at home.

Let's keep the game in Denver, put the Nuggets back at full strength, and replace Tatum with Pritchard. As you can see, the Nuggets are now projected to win 66% of the time. That sounds about right to me!

Celtics without Tatum vs full strength Denver. Denver at home.
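
The win percentages quoted above can be read off the spread distribution by adding up the probabilities on one side of zero. The sketch below is not taken from the repo. It assumes the output has been collected into a dict keyed by the home team's margin of victory, with positive numbers meaning a home win and the +20 entry covering wins by 20 or more. If the model uses the betting convention instead (negative means the home team is favored), flip the comparisons. The function names and the toy numbers are made up for illustration.

```python
def home_win_probability(spread_probs):
    """Probability that the home team wins, given {home_margin: probability}."""
    total = sum(spread_probs.values())
    if total <= 0:
        raise ValueError("spread_probs must contain positive probability mass")
    # Renormalize so the result is still a probability even if the
    # inputs do not sum to exactly 1.
    return sum(p for margin, p in spread_probs.items() if margin > 0) / total


def blowout_probability(spread_probs, threshold=20):
    """Probability that the home team wins by at least `threshold` points."""
    total = sum(spread_probs.values())
    if total <= 0:
        raise ValueError("spread_probs must contain positive probability mass")
    return sum(p for margin, p in spread_probs.items() if margin >= threshold) / total


# Toy distribution, invented for illustration (NOT real model output).
# Assumption: keys are home margins, positive = home team wins.
toy_probs = {-20: 0.05, -5: 0.20, 3: 0.35, 10: 0.26, 20: 0.14}

print(f"home win: {home_win_probability(toy_probs):.2f}")
print(f"home win by 20+: {blowout_probability(toy_probs):.2f}")
```

Any probability on a margin of exactly 0 is ignored by the win calculation, since an NBA game cannot end in a tie.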

Installation

I recommend installing Python 3.11.8, as that is the version the repo was written and tested with. The code will likely work with most recent versions of Python, though.

Once you have Python installed, run pip install -r requirements.txt. It will take a while to install dependencies if you don't already have PyTorch cached.

Usage

The example.ipynb notebook shows how to use the model to predict the final game of the 2023-24 NBA season, a game between the Dallas Mavericks and Boston Celtics. It will output the chart above.

To change the players and their ages, you must reference the player_tokens.csv and age_tokens.csv files.

For example, if you wanted to remove Kristaps Porzingis from Boston's team, you would take the token representing Porzingis (4416) out of the home_team_tokens list and replace it with, say, the token for Payton Pritchard (4999). You would then look up Pritchard's age (26), find the corresponding age token in age_tokens.csv (11), and use it to replace Porzingis' age token, which is the second-to-last token in the list.

To swap home and away, you could replace the variables containing all of the player and age tokens, or just set the swap_home_away variable to True.
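
The Porzingis-for-Pritchard substitution described above could be sketched as follows. This is a rough illustration, not code from the notebook. It assumes each team's list holds the 8 player tokens first, followed by the 8 matching age tokens in the same order, which is consistent with Porzingis' age token being second to last. The swap_player helper, the away_team_tokens name, and every token value other than 4416, 4999, and 11 are placeholders.

```python
N_PLAYERS = 8  # players per team, as described in the Intro


def swap_player(team_tokens, old_player, new_player, new_age_token):
    """Return a copy of team_tokens with one player and his age token replaced.

    Assumed layout (not confirmed by the repo): N_PLAYERS player tokens,
    then N_PLAYERS age tokens in the same order.
    """
    if len(team_tokens) != 2 * N_PLAYERS:
        raise ValueError(f"expected {2 * N_PLAYERS} tokens, got {len(team_tokens)}")
    # Search only the player half; raises ValueError if the player is absent.
    slot = team_tokens.index(old_player, 0, N_PLAYERS)
    updated = list(team_tokens)
    updated[slot] = new_player
    updated[N_PLAYERS + slot] = new_age_token
    return updated


# Placeholder tokens: only 4416 (Porzingis), 4999 (Pritchard) and
# 11 (age 26) come from the text above.
home_team_tokens = [101, 102, 103, 104, 105, 106, 4416, 107,
                    5, 6, 7, 8, 9, 10, 12, 13]
away_team_tokens = [201, 202, 203, 204, 205, 206, 207, 208,
                    5, 6, 7, 8, 9, 10, 11, 12]

new_home = swap_player(home_team_tokens, 4416, 4999, 11)
print(new_home)

# Swapping venues by hand is a plain variable swap.
home_side, away_side = away_team_tokens, new_home
```

Returning a copy rather than editing the list in place keeps the original lineup around, which is handy when comparing several what-if scenarios.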

Training Process

I downloaded data from stats.nba.com using the nba_api package (https://github.com/swar/nba_api) to get information on minutes played, game outcomes, and a few other dimensional elements to make everything fit together. Then, I ran a custom PyTorch training loop to train the model(s) on their chosen loss objective (spread, money line, or spread probability).

The training code is roughly based on nanoGPT and torchtune.
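
For the spread-probability objective, each finished game has to be turned into a training target. One plausible way to do that, assumed here and not taken from the repo, is to clip the home team's final margin to the -20 to +20 range and use the result as a class index, so every blowout lands in one of the two end buckets. The function name and the clipping rule are illustrative only.

```python
MAX_SPREAD = 20  # spreads run from -20 to +20, per the Intro


def margin_to_class(home_points, away_points, max_spread=MAX_SPREAD):
    """Map a final score to a class index in [0, 2 * max_spread].

    Assumption: margins beyond +/- max_spread are clipped into the
    end buckets, so index 0 means "home lost by max_spread or more"
    and index 2 * max_spread means "home won by max_spread or more".
    """
    margin = home_points - away_points
    clipped = max(-max_spread, min(max_spread, margin))
    return clipped + max_spread


# Made-up final scores, for illustration only.
for home, away in [(110, 100), (130, 100), (90, 120)]:
    print(home, away, margin_to_class(home, away))
```

With 41 classes, a standard classification loss over these indices would give the per-spread probabilities described in the Intro.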