Prediction_API

1. Purpose and project objective

Purpose

  • To develop an API that uses real-estate data to provide the following functionalities:
    1. modeling a house in 3D from lidar satellite images (GeoTIFF files) by entering only a home address. This part is an extension of a previous project
    2. locating the house on a map by entering its address
    3. forecasting the price of a building (i.e. a house or an apartment) from multiple features (postal code, number of rooms, living space, surface area, etc.)
  • To deploy the API on Azure (using, among others, Docker and Travis)

Objectives

  • Consolidate knowledge of Python, specifically NumPy, Pandas, Sklearn, Matplotlib, ...
  • To be able to search for and implement new libraries
  • Consolidate knowledge of data science and machine learning algorithms in order to develop an accurate regression model
  • To be able to structure the project with object-oriented programming (OOP)
  • To be able to implement the whole project, and make it work, through an API (using Flask)
  • To be able to deploy the API on a web-based environment (in this case Azure)

Features

Must-have

  • The API must be functional
  • Your model must be functional

Nice-to-Have

  • The API is deployed on a web-based environment (e.g. Heroku, Azure, etc.)
  • Optimize your solution to return the result as fast as possible.
  • The API searches for as much information as possible on its own. (For example, area => cadastre)
  • Better visualization
  • You provide a 3D representation of the house

Context of the project

  • All the work was done during BeCode's AI/data science bootcamp (2020-2021)

2. The project

Working plan and steps

1. Research

  • Research and understand the terms, concepts and requirements of the project.
  • Discover new libraries that can serve the project's purposes
  • Develop, use and test machine learning algorithms (e.g. sklearn linear, SVM and decision tree regression, XGBoost, ...)

2. Data collection

3. Data manipulation

  • Data cleaning: this included, among other steps, removing outliers, dropping features with too many missing values (>15%), and running multivariate feature imputation on the features with fewer missing values (using sklearn.impute.IterativeImputer)

  • Feature engineering: location (postal code) cannot readily be integrated into a quantitative model, yet it has a huge impact on real-estate prices. A ranking index was therefore computed from the average house price of each entity in Belgium. This index shows a good association with house prices, and its third-degree polynomial explained the target best (more than 25% of the variance in house price explained by this feature alone, based on the R² coefficient).
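
The cleaning and feature-engineering steps above can be sketched as follows. This is a minimal illustration on toy data, not the project's actual code: the column names (`postal_code`, `living_area`, `price`, ...), the helper function names and the `postal_rank` column are assumptions made for the example. Only the 15% threshold, the use of `sklearn.impute.IterativeImputer` and the third-degree polynomial come from the description above.

```python
import numpy as np
import pandas as pd
# IterativeImputer is still experimental and must be enabled explicitly.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer


def drop_sparse_columns(df, max_missing=0.15):
    """Drop columns whose share of missing values exceeds max_missing."""
    missing_share = df.isna().mean()
    return df.loc[:, missing_share <= max_missing]


def impute_numeric(df, random_state=0):
    """Fill the remaining gaps in numeric columns by multivariate imputation."""
    numeric_cols = df.select_dtypes(include="number").columns
    out = df.copy()
    imputer = IterativeImputer(random_state=random_state)
    out[numeric_cols] = imputer.fit_transform(out[numeric_cols])
    return out


def add_postal_rank(df, postal_col="postal_code", price_col="price"):
    """Rank postal codes by mean price (1 = cheapest) and attach the rank."""
    mean_price = df.groupby(postal_col)[price_col].mean()
    rank = mean_price.rank(method="dense").astype(int)
    out = df.copy()
    out["postal_rank"] = out[postal_col].map(rank)
    return out


def polynomial_r2(x, y, degree=3):
    """R² of a least-squares polynomial fit of y on x."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return 1.0 - residuals.var() / y.var()


# Toy data (hypothetical values, for illustration only).
toy = pd.DataFrame({
    "postal_code": ["1000", "1000", "4000", "4000",
                    "9000", "9000", "5000", "5000"],
    "living_area": [80.0, np.nan, 120.0, 150.0, 95.0, 110.0, 90.0, 100.0],
    "bedrooms": [2, 2, 3, 4, 2, 3, 2, 3],
    "mostly_missing": [np.nan] * 7 + [1.0],
    "price": [320_000, 340_000, 210_000, 250_000,
              280_000, 290_000, 190_000, 200_000],
})

clean = add_postal_rank(impute_numeric(drop_sparse_columns(toy)))
r2 = polynomial_r2(
    clean["postal_rank"].to_numpy(dtype=float),
    clean["price"].to_numpy(dtype=float),
)
print(clean[["postal_code", "postal_rank"]].drop_duplicates())
print(f"R² of the cubic fit on postal_rank: {r2:.2f}")
```

In a real pipeline the ranking would be computed on the training split only, so that test-set prices do not leak into the feature.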

4. Modeling

  • Features :
    • type of building: house/apartment
    • living area: square meters
    • land surface: square meters
    • number of facades
    • number of bedrooms
    • garden: yes/no
    • terrace: yes/no
    • terrace area: square meters
    • equipped kitchen: yes/no
    • fireplace: yes/no
    • swimming pool: yes/no
    • state of the building: as new, just renovated, good, to refresh, to renovate, to restore (one-hot encoded)
  • Target:
    • House price: euros
  • Machine learning model:
    • Multiple models, using an increasing number of features and based on various algorithms (e.g. linear, SVM, decision tree, XGBoost), were trained and evaluated.
    • The best model was based on the XGBoost algorithm (n_estimators=700, max_depth=4, learning_rate=0.3) and reached an R² of 0.82 on the training set and 0.76 on the test set
    • The best-fitting model was saved as a pickle file, which was integrated into the API for price estimation
    • Examples of Python code for data manipulation and algorithm development are stored in the notebook folder of the current repository
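
The training and pickling steps described above might look roughly like the sketch below. It runs on synthetic data, so its scores are unrelated to the 0.82 / 0.76 reported above. The hyperparameters are the ones listed; the synthetic features, the 80/20 split and the fallback to scikit-learn's `GradientBoostingRegressor` when `xgboost` is not installed are assumptions added for the example.

```python
import pickle

import numpy as np
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

PARAMS = dict(n_estimators=700, max_depth=4, learning_rate=0.3)

try:
    from xgboost import XGBRegressor

    def make_model():
        return XGBRegressor(**PARAMS)
except ImportError:
    # Assumption for this sketch only: fall back to scikit-learn's
    # gradient boosting so the example still runs without xgboost.
    from sklearn.ensemble import GradientBoostingRegressor

    def make_model():
        return GradientBoostingRegressor(**PARAMS)


# Synthetic stand-in for the cleaned dataset (hypothetical features).
rng = np.random.default_rng(0)
n_rows = 400
living_area = rng.uniform(40, 250, n_rows)
bedrooms = rng.integers(1, 6, n_rows)
postal_rank = rng.integers(1, 50, n_rows)
X = np.column_stack([living_area, bedrooms, postal_rank])
y = (2_000 * living_area + 15_000 * bedrooms + 10_000 * postal_rank
     + rng.normal(0, 20_000, n_rows))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = make_model()
model.fit(X_train, y_train)

r2_train = r2_score(y_train, model.predict(X_train))
r2_test = r2_score(y_test, model.predict(X_test))
print(f"R² train: {r2_train:.2f}, R² test: {r2_test:.2f}")

# Serialize the fitted model, as the API would load it at start-up.
blob = pickle.dumps(model)
restored = pickle.loads(blob)
```

A gap between the training and test scores, like the one reported above, is the usual sign of some overfitting; lowering `learning_rate` or `max_depth` is a common first adjustment.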

Project output

1. API Structure

2. API Routes

  • Estimate: in

  • Estimate: out

  • Map: in

  • Map: out

  • 3D reconstruction: in

  • 3D reconstruction: out
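
As an illustration of how an estimation route could be wired in Flask, here is a minimal sketch. The route path `/estimate`, the JSON field names, the response key `estimated_price` and the `predict_price` stand-in are all hypothetical; the project's real request and response formats are not reproduced in this text.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical subset of the model's features, for illustration only.
REQUIRED_FIELDS = ("living_area", "bedrooms")


def predict_price(features):
    """Stand-in for the pickled model; NOT the project's real estimator."""
    return 1500.0 * features["living_area"] + 10000.0 * features["bedrooms"]


@app.route("/estimate", methods=["POST"])
def estimate():
    payload = request.get_json(silent=True) or {}
    try:
        # Reject requests with missing or non-numeric fields.
        features = {name: float(payload[name]) for name in REQUIRED_FIELDS}
    except (KeyError, TypeError, ValueError):
        return jsonify(error=f"expected numeric fields: {REQUIRED_FIELDS}"), 400
    return jsonify(estimated_price=predict_price(features))


# Exercise the route without starting a server.
client = app.test_client()
ok = client.post("/estimate", json={"living_area": 100, "bedrooms": 3})
bad = client.post("/estimate", json={"living_area": 100})
print(ok.status_code, ok.get_json())
print(bad.status_code, bad.get_json())
```

In the real API, `predict_price` would be replaced by a call to the unpickled model's `predict` method.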
