NBA-RANKINGS-2024

The model is a game-level classifier that predicts the probability of either team winning a match-up based on historical box score stats. No assumptions are used to make predictions; the only judgment applied is in the choice of feature engineering and modeling framework.

What Does the Model Predict?

The model predicts the probability of either team in the match-up winning. It does not predict points scored.

How the Model Works

The model is based on a two-step procedure:

  • Step 1: Create PCAs (principal components) using box score statistics and player height. The PCAs capture player characteristics and are based on normalized features, such as three-point tendency, share of rebounds, true shooting percentage, etc.

  • The PCAs are based on data leading up to the game. No data from the game itself is used, except for injury status.

  • Step 2: Train a shallow XGBoost model using the PCAs and a home-court indicator. The model is trained on data since the 2014-2015 season. The one-game-ahead in-sample AUC is 0.73 and the one-game-ahead out-of-sample AUC is 0.71.

  • Team performance is not included in the model. Predictions are based purely on roster composition and home-court advantage.
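
The rule that features use only data leading up to the game can be illustrated with a small sketch. It is written in Python purely for illustration (the repository itself is in R), and the function name `pregame_averages`, the sample rows, and the use of an expanding average are all assumptions, not the repository's actual feature code. The point is that each game's feature is computed before that game's own stats are added to the running totals.

```python
from collections import defaultdict


def pregame_averages(games, stat):
    """Return, for each row, the player's average of `stat` over EARLIER games only.

    `games` is assumed to be sorted by date. The current game's value is added
    to the running totals only after its feature has been recorded, so no
    information from the game being predicted leaks into its own feature.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    features = []
    for row in games:
        player = row["player"]
        n = counts[player]
        # No history yet: the feature is missing rather than guessed.
        features.append(totals[player] / n if n else None)
        totals[player] += row[stat]
        counts[player] += 1
    return features


# Hypothetical sample data: one row per player per game.
games = [
    {"player": "A", "date": "2024-01-01", "three_point_rate": 0.40},
    {"player": "B", "date": "2024-01-01", "three_point_rate": 0.10},
    {"player": "A", "date": "2024-01-03", "three_point_rate": 0.50},
    {"player": "A", "date": "2024-01-05", "three_point_rate": 0.30},
]

features = pregame_averages(games, "three_point_rate")
for row, value in zip(games, features):
    print(row["player"], row["date"], value)
```

Player A's third game gets the average of the first two games only; the 0.30 recorded in that game does not affect its own feature.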

Benefits of this approach:

  • Data engineering is simple since it relies on box score stats.
  • This is a player-level model, which means that it can adapt to changes in rosters, injuries, and line-ups. There is no need to wait and see how the team does after roster changes.
  • We can also use it to simulate how roster changes can affect the outcome of a given match-up.
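
As a rough illustration of why a player-level model supports roster simulations, the sketch below builds team-level inputs as a minutes-weighted average of per-player component scores and recomputes them after a roster change. The aggregation rule, the function `team_features`, and all numbers are assumptions made for this example; the README does not specify how player components are combined. Python is used only for illustration, since the repository is in R.

```python
def team_features(roster):
    """Minutes-weighted average of each player's component scores (assumed rule)."""
    total_minutes = sum(p["minutes"] for p in roster)
    n_components = len(roster[0]["scores"])
    return [
        sum(p["minutes"] * p["scores"][k] for p in roster) / total_minutes
        for k in range(n_components)
    ]


# Hypothetical roster: expected minutes and two made-up component scores per player.
roster = [
    {"name": "starter_1", "minutes": 36, "scores": [1.0, -0.5]},
    {"name": "starter_2", "minutes": 30, "scores": [0.0, 1.0]},
    {"name": "bench_1", "minutes": 14, "scores": [-1.0, 0.0]},
]
baseline = team_features(roster)

# Simulated change: starter_1 is out and bench_1 absorbs those minutes.
without_starter = [
    {"name": "starter_2", "minutes": 30, "scores": [0.0, 1.0]},
    {"name": "bench_1", "minutes": 50, "scores": [-1.0, 0.0]},
]
simulated = team_features(without_starter)

print("baseline :", baseline)
print("simulated:", simulated)
```

In a full pipeline, both feature vectors would be passed to the trained classifier, and the difference between the two predicted win probabilities would be the estimated effect of the roster change.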

Downsides to this approach:

  • It relies purely on box score stats. No advanced stats based on play-by-play data are included.
  • The minute allocation algorithm is based on simple moving averages. In real life, coaches adapt to the opponent.
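
A minimal sketch of a moving-average minute allocation, to make the second limitation concrete. Only the use of simple moving averages comes from the description above. The window of three games, setting injured players to zero, and rescaling the team total to 240 minutes (five players times 48 minutes of regulation) are assumptions for illustration, as are the names `moving_average_minutes` and `allocate_minutes`. Python is used for illustration only.

```python
def moving_average_minutes(history, window=3):
    """Simple moving average of a player's minutes over the last `window` games."""
    recent = history[-window:]
    return sum(recent) / len(recent) if recent else 0.0


def allocate_minutes(minutes_history, injured=(), window=3, team_total=240.0):
    """Project minutes per player, then rescale so the team total is `team_total`.

    Injured players get zero minutes (assumption); their share is spread
    proportionally across the remaining players by the rescaling step.
    """
    raw = {
        player: 0.0 if player in injured else moving_average_minutes(history, window)
        for player, history in minutes_history.items()
    }
    total = sum(raw.values())
    if total == 0:
        return raw
    return {player: minutes * team_total / total for player, minutes in raw.items()}


# Hypothetical minutes played in each player's last three games.
minutes_history = {
    "p1": [38, 34, 36],
    "p2": [34, 34, 34],
    "p3": [30, 32, 34],
    "p4": [30, 30, 30],
    "p5": [28, 30, 32],
    "p6": [28, 28, 28],
    "p7": [26, 26, 26],
    "p8": [22, 24, 26],
}

healthy = allocate_minutes(minutes_history)
without_p1 = allocate_minutes(minutes_history, injured=("p1",))
print(healthy)
print(without_p1)
```

Nothing in this allocation depends on the opponent, which is exactly the limitation noted above: a coach may shorten or reshuffle the rotation for a specific match-up, while a moving average cannot.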

About the code

1 - read_and_transform_data: Reads the raw data from the hoopR package, creates the features, and organizes the data for training.

2 - train_model.R: Trains and validates the XGBoost model.

3 - plot.R: Creates the scatter plot used to compare predicted and actual team rankings.

nba-rankings-2024's People

Contributors

klarsen1