1stscience / ml_random_forest_house_price_predictor

This project forked from zhenyu92/ml_random_forest_house_price_predictor

An end-to-end ML pipeline for studying historical market data on real estate valuation collected from Sindian District, New Taipei City, Taiwan.

Jupyter Notebook 99.05% Shell 0.03% Python 0.92%

ml_random_forest_house_price_predictor's Introduction

House Price Predictor for Sindian District

This project is a part of the evaluation of my application to AIAP.

Project Status: [Completed]

Project Objective

This study uses historical market data on real estate valuation collected from Sindian District, New Taipei City, Taiwan. The dataset is crawled from Here.

Environment Prerequisites

Python 3, pip3, and curl are required. To install them, run the following in a terminal:

$ sudo apt install python3 python3-pip curl

Instruction

Run the executable bash script named run.sh in the base folder.

bash run.sh

The script will:

  • Install the prerequisite libraries listed in requirements.txt:
    numpy
    pandas
    matplotlib
    sklearn
    seaborn
    
  • Download the dataset as real_estate.csv.
  • Run AIAP.py, a Python script that imports real_estate.csv and trains a regression model on it.

ML Methods Covered

There are two ML methods covered in this study:

  • Linear Regression
  • Random Forest Regression

To select a method, key in the corresponding index when prompted:
Which model do you want to use? [1] Linear Regression, [2] Random Forest
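
The selection prompt above could be handled along the following lines. This is only a minimal sketch: the names MODEL_CHOICES, parse_choice, and ask_for_model are hypothetical and are not taken from AIAP.py, whose actual implementation is not shown in this README.

```python
# Hypothetical mapping from the menu index to a model name.
# The real structure used in AIAP.py is not shown in this README.
MODEL_CHOICES = {
    "1": "Linear Regression",
    "2": "Random Forest",
}

PROMPT = "Which model do you want to use? [1] Linear Regression, [2] Random Forest\n"


def parse_choice(raw):
    """Return the model name for a menu index, or None if the input is invalid."""
    return MODEL_CHOICES.get(raw.strip())


def ask_for_model(read=input):
    """Keep prompting until a valid index is entered, then return the model name.

    `read` defaults to the built-in input(); passing another callable makes
    the function usable without a terminal.
    """
    while True:
        choice = parse_choice(read(PROMPT))
        if choice is not None:
            return choice
        print("Please enter 1 or 2.")
```

Calling ask_for_model() in a terminal shows the prompt and repeats it until 1 or 2 is entered. Surrounding whitespace in the answer is ignored.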

Exploratory Data Analysis

A detailed explanation can be found in the IPython file, which is organised as follows:

  • 1 Importing relevant libraries
  • 2 Loading raw data
  • 3 Preprocessing
    • 3.1 Exploring the descriptive statistics of the variables
      • 3.2.1 Handling categorical variable
    • 3.2 Dealing with missing values
    • 3.3 Looking for correlation
    • 3.4 Exploring the PDFs
      • 3.4.1 Exploring variables (X1 - X6, Y)
    • 3.5 Dealing with outliers
    • 3.6 Log transformation
  • 4 Prepare data for ML and create test set
    • 4.1 Declare inputs and targets
    • 4.2 Data scaling
    • 4.3 Train test split
  • 5 Select and train a model
    • 5.1 Linear regression model
    • 5.2 Random forest regressor
  • 6 Apply model on test set
    • 6.1 Test with linear regressor
    • 6.2 Test with RF regressor
      • 6.2.1 Grid Search (fine-tune)
      • 6.2.2 Apply model to test set
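
Steps 3.6 through 6.2.2 of the outline above can be sketched with scikit-learn roughly as follows. This is an illustrative sketch, not the notebook's code. It runs on synthetic stand-in data rather than real_estate.csv, and the hyperparameter grid, split ratio, and random seeds are all assumptions. It also splits the data before scaling, which differs from the order listed in the outline (4.2 then 4.3), so that the scaler is fitted on training rows only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the real_estate.csv dataset): six inputs
# playing the role of X1-X6 and a strictly positive target playing the role of Y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = np.exp(0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.1, size=200))

# 3.6 Log transformation of the target.
log_y = np.log(y)

# 4.3 Train/test split (assumed 80/20; done before scaling in this sketch).
X_train, X_test, y_train, y_test = train_test_split(
    X, log_y, test_size=0.2, random_state=42
)

# 4.2 Data scaling: fit on the training rows, apply to both sets.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# 5.1 Linear regression model.
lin = LinearRegression().fit(X_train_s, y_train)

# 5.2 and 6.2.1 Random forest regressor, fine-tuned with a small grid search.
# The grid values are illustrative assumptions.
param_grid = {"n_estimators": [20, 50], "max_depth": [None, 5]}
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=3,
    scoring="neg_mean_squared_error",
)
search.fit(X_train_s, y_train)
forest = search.best_estimator_

# 6.1 and 6.2.2 Apply both models to the held-out test set (R^2 on log scale).
lin_r2 = lin.score(X_test_s, y_test)
rf_r2 = forest.score(X_test_s, y_test)
print("Best RF parameters:", search.best_params_)
print("Linear regression R^2:", round(lin_r2, 3))
print("Random forest R^2:", round(rf_r2, 3))
```

Because the target is log-transformed, predictions from either model would need np.exp applied to return to the original price scale.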

ml_random_forest_house_price_predictor's People

Contributors

zhenyu92