Loan Prediction

Predict whether loans acquired by Fannie Mae will go into foreclosure. Fannie Mae acquires loans from other lenders to encourage them to lend more. It publishes data on the loans it has acquired and on their subsequent performance.

Installation

Download the data

  • Clone this repo to your computer.
  • Change into the folder using cd loan-prediction.
  • Run mkdir data.
  • Switch into the data directory using cd data.
  • Download the data files from Fannie Mae into the data directory.
    • You can find the data here.
    • You'll need to register with Fannie Mae to download the data.
    • It's recommended to download all the data from 2012 Q1 to present.
  • Extract all of the .zip files you downloaded.
    • On OSX, you can run this command: find ./ -name \*.zip -exec unzip {} \;
    • At the end, you should have a set of text files named Acquisition_YQX.txt and Performance_YQX.txt, where Y is a year and X is a quarter number from 1 to 4.
  • Remove all the zip files by running rm *.zip.
  • Switch back into the loan-prediction directory by running cd .. (two dots, the parent directory).
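
If find and unzip are not available (for example on Windows), the extract-and-clean-up steps above can be approximated with Python's standard library. This is a minimal sketch, not part of the repository: the function name extract_and_remove_zips is hypothetical, and it assumes every archive should be unpacked directly into the data directory.

```python
import zipfile
from pathlib import Path


def extract_and_remove_zips(data_dir):
    """Extract every .zip archive under data_dir into data_dir, then delete it.

    Hypothetical helper: mirrors the find/unzip and rm *.zip steps.
    Returns the names of all archive members that were extracted.
    """
    data_dir = Path(data_dir)
    extracted = []
    # sorted(...) builds the full list first, so deleting archives
    # inside the loop does not disturb the directory walk.
    for archive in sorted(data_dir.rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(data_dir)
            extracted.extend(zf.namelist())
        # Remove the archive only after it has been closed.
        archive.unlink()
    return extracted
```

Calling extract_and_remove_zips("data") from the loan-prediction directory would leave only the extracted text files in data.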

Install the requirements

  • Install the requirements using pip install -r requirements.txt.
    • Make sure you use Python 3.
    • You may want to use a virtual environment for this.

Usage

  • Run mkdir processed to create a directory for our processed datasets.
  • Run python assemble.py to combine the Acquisition and Performance datasets.
    • This will create Acquisition.txt and Performance.txt in the processed folder.
  • Run python annotate.py.
    • This will create training data from Acquisition.txt and Performance.txt.
    • It will add a file called train.csv to the processed folder.
  • Run python predict.py.
    • This runs cross-validation on the training set and prints the accuracy score.
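
To make the combining step more concrete, here is a rough sketch of how quarterly files could be concatenated into one file per dataset. It is not the code in assemble.py: the function name combine_quarterly_files is hypothetical, and the sketch assumes the quarterly text files have no header row, so plain concatenation is safe.

```python
from pathlib import Path


def combine_quarterly_files(data_dir, processed_dir, prefix):
    """Concatenate data_dir/<prefix>_*.txt into processed_dir/<prefix>.txt.

    Hypothetical helper. Assumes the input files have no header row.
    Returns the output path and the number of files combined.
    """
    data_dir = Path(data_dir)
    processed_dir = Path(processed_dir)
    processed_dir.mkdir(parents=True, exist_ok=True)
    target = processed_dir / f"{prefix}.txt"
    # Sorting by name keeps files with the same year in quarter order.
    sources = sorted(data_dir.glob(f"{prefix}_*.txt"))
    # Binary mode streams the data line by line without decoding it.
    with target.open("wb") as out:
        for source in sources:
            with source.open("rb") as src:
                for line in src:
                    # Guard against a final line that lacks a newline.
                    out.write(line if line.endswith(b"\n") else line + b"\n")
    return target, len(sources)
```

Under these assumptions, combine_quarterly_files("data", "processed", "Acquisition") followed by the same call with "Performance" would produce the two files described above.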

Extending this

If you want to extend this work, here are a few places to start:

  • Generate more features in annotate.py.
  • Switch algorithms in predict.py.
  • Add in a way to make predictions on future data.
  • See whether you can predict if a bank should have issued the loan.
    • Remove any columns from train that the bank wouldn't have known at the time of issuing the loan.
      • Some columns are known when Fannie Mae bought the loan, but not before.
    • Make predictions.
  • See whether you can predict columns other than foreclosure_status.
    • Can you predict how much the property will be worth at sale time?
  • Explore the nuances between performance updates.
    • Can you predict how many times the borrower will be late on payments?
    • Can you map out the typical loan lifecycle?
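
As a starting point for the late-payment idea, the sketch below counts, for each loan, how many performance updates report a missed payment. Everything about the layout here is an assumption for illustration: the function name count_late_updates, the column positions id_col and status_col, the pipe delimiter, and the rule that an empty or "0" status means the loan is current. Check the real positions and status codes against Fannie Mae's file layout before using it.

```python
import csv
from collections import Counter


def count_late_updates(lines, id_col=0, status_col=1):
    """Count, per loan, the performance updates that report a late payment.

    Hypothetical helper: id_col and status_col are placeholder positions,
    and "0" or an empty status is assumed to mean the loan is current.
    """
    late = Counter()
    for row in csv.reader(lines, delimiter="|"):
        if len(row) <= max(id_col, status_col):
            continue  # skip blank or truncated rows
        loan_id = row[id_col]
        status = row[status_col].strip()
        # Adding 0 registers the loan even if it is never late.
        late[loan_id] += 0 if status in ("", "0") else 1
    return late


# Made-up rows in the assumed "loan id|status" layout.
sample = ["100|0", "100|1", "100|2", "200|0"]
print(dict(count_late_updates(sample)))
```

Because the function accepts any iterable of lines, an open file handle on processed/Performance.txt could be passed in directly once the column positions are set correctly.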