
predictive-modeling's Introduction

Predictive Modeling Course - Statistical and data science foundations

Intro

This course is designed to build foundations for predictive modeling. Many available resources are either statistical/theoretical in nature or focused on data science programming. After working through them, some key questions remain:

  1. How is a model set up in real-life use cases?
  2. Why does a particular setup such as y=f(x) have predictive power?
  3. How should predictive power metrics such as R-square, partial R-square, KS, AUC, recall/precision, etc. be interpreted?
  4. How is a model built and deployed?

In short, this course aims to close the gap between learning and practice.

chapter 1: set up

Setting up VS Code (local dev) / Python / GitHub (code repo) ... Building the data set needed for modeling ...

chapter 2: what's predictive modeling

  a. definition: find a signal for some 'future' outcome. The key is future. ...
  b. a time series example for a single entity such as a stock ticker ...
  c. multiple time series (independent and identically distributed entities) example ...
  d. what does the usual y=f(x) setup entail? ...
  e. t is everything: the difference among y_{t+1} = f(x_t) vs y_t = f(x_t) vs y_t = f_t(x_t) ...
  f. causality vs statistical relationship. Only certain statistical relationships will be considered 'predictive' ...
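To make item e concrete, here is a minimal sketch of how the time index changes the training pairs. The price series is made up and `make_pairs` is a hypothetical helper, not part of the course material. With `horizon=0` each feature is paired with the outcome at the same time, y_t = f(x_t); with `horizon=1` it is paired with the next outcome, y_{t+1} = f(x_t), which is the only one of the two that relies on information available before the outcome occurs. The third variant, y_t = f_t(x_t), where the function itself changes over time, is not shown.

```python
# Hypothetical daily closing prices for one ticker (made-up numbers).
prices = [100.0, 102.0, 101.0, 105.0, 107.0]

# Daily returns; returns[t] is known only at the close of that day.
returns = [prices[t] / prices[t - 1] - 1.0 for t in range(1, len(prices))]


def make_pairs(x, y, horizon):
    """Pair x[t] with y[t + horizon]; rows without a target are dropped."""
    if horizon < 0:
        raise ValueError("horizon must be non-negative")
    return [(x[t], y[t + horizon]) for t in range(len(x) - horizon)]


# y_t = f(x_t): feature and target share the same time index.
same_time = make_pairs(returns, returns, horizon=0)

# y_{t+1} = f(x_t): today's return is used to predict tomorrow's return.
next_step = make_pairs(returns, returns, horizon=1)

print(len(same_time), len(next_step))
```

Note that the forward-looking setup loses one row per entity, because the last observation has no future outcome yet.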

chapter 3: common predictive model setup --- gain intuition through penciled examples

  a. OLS ...
  b. simple regression ...
  c. logistic regression ...
  d. simple tree ...
  e. network ...
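As an illustration of a penciled example, the sketch below fits a one-variable OLS line using the closed-form formulas slope = S_xy / S_xx and intercept = mean(y) - slope * mean(x). The five data points are made up, and the function names are hypothetical; the numbers are small enough to verify by hand.

```python
# Made-up data points, small enough to check with pencil and paper.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_simple_ols(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_square(xs, ys, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    y_bar = sum(ys) / len(ys)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - sse / sst


intercept, slope = fit_simple_ols(xs, ys)
fit_quality = r_square(xs, ys, intercept, slope)
print(f"intercept={intercept:.2f} slope={slope:.2f} R-square={fit_quality:.4f}")
```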

chapter 4: building models

  a. design x, y, splits (train, validation, test sets) ...
  b. feature engineering ...
  c. feature selection ...
  d. model selection ...
  e. score cards and model objects ...
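One possible way to design the splits in item a is to cut by time, so that validation and test rows are always later than the training rows. This is an assumption about how the splits might be built, not the course's stated method. The row layout, the dates, and `split_by_time` are all hypothetical.

```python
from datetime import date


def split_by_time(rows, train_end, valid_end):
    """Split (as_of_date, features, target) rows into three ordered sets.

    Rows dated up to train_end go to train, rows after train_end and up to
    valid_end go to validation, and everything later goes to test.
    """
    train = [r for r in rows if r[0] <= train_end]
    valid = [r for r in rows if train_end < r[0] <= valid_end]
    test = [r for r in rows if r[0] > valid_end]
    return train, valid, test


# Made-up monthly snapshots for one year.
rows = [(date(2023, m, 1), {"x": float(m)}, m % 2) for m in range(1, 13)]

train, valid, test = split_by_time(rows, date(2023, 8, 1), date(2023, 10, 1))
print(len(train), len(valid), len(test))
```

A random split would mix future rows into training; the time cut keeps the evaluation closer to how the model would be used after deployment.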

chapter 5: evaluating model

  a. predictive power metrics ...
  b. cross time validation ...
  c. bias-variance tradeoff ...
  d. underfitting/overfitting ...
  e. leakage ...
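The predictive power metrics in item a can be computed by hand on a small example. The sketch below is a plain-Python illustration under stated assumptions: the labels and scores are made up, the function names are hypothetical, AUC is computed as the share of (positive, negative) pairs ranked correctly, and KS as the largest gap between the cumulative score distributions of the two classes.

```python
def split_scores(labels, scores):
    """Separate scores into those of positives (1) and negatives (0)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    return pos, neg


def auc(labels, scores):
    """Probability that a random positive outscores a random negative."""
    pos, neg = split_scores(labels, scores)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(labels, scores):
    """Largest gap between the score CDFs of negatives and positives."""
    pos, neg = split_scores(labels, scores)
    best = 0.0
    for threshold in sorted(set(scores)):
        cdf_pos = sum(s <= threshold for s in pos) / len(pos)
        cdf_neg = sum(s <= threshold for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


def precision_recall(labels, scores, threshold):
    """Precision and recall when scores >= threshold are flagged positive."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for l, f in zip(labels, flagged) if f and l == 1)
    fp = sum(1 for l, f in zip(labels, flagged) if f and l == 0)
    fn = sum(1 for l, f in zip(labels, flagged) if not f and l == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up outcomes and model scores for eight cases.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

print(auc(labels, scores))                    # 0.75
print(ks_statistic(labels, scores))           # 0.5
print(precision_recall(labels, scores, 0.5))  # (0.75, 0.75)
```

AUC and KS describe how well the scores rank cases without committing to a cutoff, while precision and recall depend on the chosen threshold.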

chapter 6: deployment -- batch or near real time

  a. using score cards with database ...
  b. using model object with python ...
  c. endpoint using SageMaker ...
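For item a, one way a score card could be applied in batch is to store it as a table of bins and points and score records with a join. The sketch below uses an in-memory SQLite database purely for illustration; the table names, columns, bins, and point values are all hypothetical and not taken from the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical score card: each feature bin [lo, hi) is worth some points.
conn.executescript("""
CREATE TABLE scorecard (feature TEXT, lo REAL, hi REAL, points INTEGER);
INSERT INTO scorecard VALUES
  ('age', 0, 30, 10),
  ('age', 30, 200, 25),
  ('utilization', 0.0, 0.5, 30),
  ('utilization', 0.5, 10.0, 5);

CREATE TABLE applicants (id INTEGER, feature TEXT, value REAL);
INSERT INTO applicants VALUES
  (1, 'age', 25), (1, 'utilization', 0.2),
  (2, 'age', 45), (2, 'utilization', 0.8);
""")

# Batch scoring: match each feature value to its bin and sum the points.
rows = conn.execute("""
SELECT a.id, SUM(s.points) AS score
FROM applicants AS a
JOIN scorecard AS s
  ON s.feature = a.feature AND a.value >= s.lo AND a.value < s.hi
GROUP BY a.id
ORDER BY a.id
""").fetchall()

card_scores = dict(rows)
print(card_scores)  # {1: 40, 2: 30}
conn.close()
```

Because the model lives in a table, it can be updated or audited without redeploying code, which is one reason score cards suit batch scoring inside a database.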

chapter 7: deep dive into some key issues on these predictive models

  a. skewed data in y ...
  b. skewed data in x ...
  c. boosting vs bootstrapping ...
  d. drifting and time travel ...
  e. incrementality or controllable model impact ...
  f. CI/CD
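The contrast in item c can be sketched with a very small base learner. This example assumes that "bootstrapping" refers to bootstrap aggregation (bagging): models are fit independently on resamples drawn with replacement and then averaged, whereas boosting fits models sequentially, each one to the residuals left by the previous ones. The data, the learning rate, and all function names are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Least-squares regression stump: one split on a single feature."""
    best = None
    for split in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= split]
        right = [y for x, y in zip(xs, ys) if x > split]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - lm) ** 2 for y in left)
               + sum((y - rm) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, split, lm, rm)
    if best is None:  # every x is identical: predict the mean
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, split, lm, rm = best
    return lambda x: lm if x <= split else rm


def bagged(xs, ys, rounds, rng):
    """Average of stumps, each fit on an independent bootstrap resample."""
    n = len(xs)
    models = []
    for _ in range(rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(m(x) for m in models) / len(models)


def boosted(xs, ys, rounds, learning_rate=0.5):
    """Sum of stumps, each fit to the residuals of the previous rounds."""
    models = []
    residuals = list(ys)
    for _ in range(rounds):
        stump = fit_stump(xs, residuals)
        models.append(stump)
        residuals = [r - learning_rate * stump(x)
                     for x, r in zip(xs, residuals)]
    return lambda x: learning_rate * sum(m(x) for m in models)


def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up curved relationship.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]

bag = bagged(xs, ys, rounds=10, rng=random.Random(0))
boost = boosted(xs, ys, rounds=20)
print(mse(bag, xs, ys), mse(boost, xs, ys))
```

The bagging rounds could run in parallel because they do not depend on each other, while each boosting round needs the residuals from the rounds before it.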

predictive-modeling's People

Contributors

liuyunliu2000
