
Predicting housing prices in Iowa using Python/Pandas/linear regression within SKLearn.



Kaggle House Regression Challenge

This is the "Pandas Express" submission for the Kaggle House Prices: Advanced Regression Techniques Challenge, completed as part of the BMGT438A data science class. Note that the code in this repository relies on data provided by Kaggle, which has been removed from the repository's history. Please visit Kaggle to obtain the dataset.

Final Video Presentation

Final presentation including analysis can be viewed here:

BMGT438A Final Video Presentation

Final Poster

Methodology

  1. Clean up the data by converting all of the categorical columns (e.g., Neighborhood) into dummy variables that an ML model can read, using pd.get_dummies()
  2. Create an initial OLS (Ordinary Least Squares) model to check its R^2 value and see whether there are any other problems with the data
  3. Make use of SKLearn's automated feature selection by using RFECV (Recursive Feature Elimination with Cross-Validation) to determine how many features to use in the final model as well as which features those are
  4. Explore SKLearn's univariate feature selection to see whether it performs better than RFECV
  5. Build the final model and analyze the residuals to look for outliers and to see whether there are any patterns in the model's inaccuracies
  6. Run the final model on the test dataset to predict the prices needed for the final Kaggle submission
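
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the Kaggle data is not in the repository, so the sketch builds a small synthetic table, and the column values, the noise level, the `k=3` choice, and the standardization step are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import RFECV, SelectKBest, f_regression
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the Kaggle training data (hypothetical values).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "GrLivArea": rng.uniform(500, 3000, n),
    "LotArea": rng.uniform(2000, 15000, n),
    "Noise": rng.normal(size=n),  # deliberately uninformative column
    "Neighborhood": rng.choice(["A", "B", "C"], n),
})
bonus = df["Neighborhood"].map({"A": 0.0, "B": 20000.0, "C": 50000.0})
df["SalePrice"] = (100 * df["GrLivArea"] + 2 * df["LotArea"]
                   + bonus + rng.normal(0, 5000, n))

# Step 1: one-hot encode the categorical columns.
X = pd.get_dummies(df.drop(columns="SalePrice"), drop_first=True, dtype=float)
y = df["SalePrice"]

# Assumption: standardize features, because RFE ranks features by
# coefficient magnitude, which is only comparable on a common scale.
X = (X - X.mean()) / X.std()

# Step 2: baseline OLS fit and its R^2 on the training data.
baseline_r2 = LinearRegression().fit(X, y).score(X, y)

# Step 3: recursive feature elimination with 5-fold cross-validation.
rfecv = RFECV(LinearRegression(), step=1, cv=5, scoring="r2").fit(X, y)
rfecv_features = list(X.columns[rfecv.support_])

# Step 4: univariate selection, keeping the k best features by F-score.
kbest = SelectKBest(score_func=f_regression, k=3).fit(X, y)
kbest_features = list(X.columns[kbest.get_support()])

# Step 5: refit on the RFECV features and compute residuals for inspection.
final_model = LinearRegression().fit(X[rfecv_features], y)
residuals = y - final_model.predict(X[rfecv_features])

print(f"baseline R^2: {baseline_r2:.3f}")
print("RFECV features:", rfecv_features)
print("SelectKBest features:", kbest_features)
```

For step 6, the same encoding and column selection would need to be applied to the Kaggle test file before calling `final_model.predict`; reindexing the test dummies to the training columns is one common way to keep the two feature sets aligned.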
