himanshuxd / aushousingregressionanalysis

A comprehensive regression analysis of Australian housing market data to predict property values for strategic investment decisions.

License: MIT License

Jupyter Notebook 100.00%

Housing Regression Analysis

This repository presents an in-depth regression analysis of the Australian housing market. Surprise Housing, a US-based company, is entering the Australian real estate market. Our objective is to predict property values accurately from the available data, so that Surprise Housing can make informed investment decisions: identify properties priced below market value, acquire them, and resell them at a profit.

Project Overview

This repository focuses on a regression analysis using the train.csv dataset to predict property values in the Australian housing market for Surprise Housing, a US-based real estate company entering Australia. A guide to the data structure is given in data_description.txt. By leveraging statistical modeling, the project aims to equip Surprise Housing with data-driven strategies for informed decisions, supporting successful market entry and profitable property transactions in Australia.

Approach

We begin with comprehensive data preprocessing, including handling missing values and categorical encoding. Exploratory Data Analysis (EDA) delves into the dataset's nuances, providing insights that inform feature engineering decisions. The code then implements Lasso and Ridge Regression models for predictive analytics. To ensure robust evaluation, metrics like RMSE and R-squared are employed.
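The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it runs on a small synthetic frame instead of train.csv, and the helper names (make_toy_data, build_preprocessor, fit_and_score), the median imputation strategy, the 70/30 split, and the Ridge alpha are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LassoCV, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

NUMERIC = ["OverallQual", "GrLivArea"]
CATEGORICAL = ["Neighborhood"]


def make_toy_data(n=300, seed=0):
    """Synthetic stand-in for train.csv (hypothetical values, real column names)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "OverallQual": rng.integers(1, 11, n).astype(float),
        "GrLivArea": rng.uniform(500, 3500, n),
        "Neighborhood": rng.choice(["A", "B", "C"], n),
    })
    bonus = df["Neighborhood"].map({"A": 0.0, "B": 20000.0, "C": 50000.0})
    noise = rng.normal(0, 10000, n)
    df["SalePrice"] = 20000 * df["OverallQual"] + 80 * df["GrLivArea"] + bonus + noise
    # Blank out a few values so the imputation step has something to do.
    df.loc[rng.choice(n, 15, replace=False), "GrLivArea"] = np.nan
    return df


def build_preprocessor():
    """Impute and scale numeric columns, one-hot encode categorical ones."""
    numeric_steps = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", MinMaxScaler()),
    ])
    return ColumnTransformer([
        ("num", numeric_steps, NUMERIC),
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ])


def fit_and_score(df):
    """Fit Lasso and Ridge on a train split, report RMSE and R-squared on the test split."""
    X, y = df.drop(columns="SalePrice"), df["SalePrice"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42
    )
    models = {"lasso": LassoCV(cv=5, random_state=0), "ridge": Ridge(alpha=1.0)}
    scores = {}
    for name, model in models.items():
        pipe = Pipeline([("prep", build_preprocessor()), ("model", model)])
        pipe.fit(X_train, y_train)
        pred = pipe.predict(X_test)
        scores[name] = {
            "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
            "r2": float(r2_score(y_test, pred)),
        }
    return scores


if __name__ == "__main__":
    for name, metrics in fit_and_score(make_toy_data()).items():
        print(name, metrics)
```

Wrapping the preprocessing and the model in one Pipeline keeps the imputer and scaler fitted on the training split only, which avoids leaking test-set statistics into the evaluation.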

Conclusions

  • Variables such as OverallQual, GrLivArea, TotalBsmtSF, and YearBuilt significantly influence housing prices positively.
  • The neighborhood (Neighborhood) plays a crucial role in determining housing prices: prices vary substantially from one neighborhood to another.
  • Property age (PropAge) negatively impacts housing prices, with newer properties commanding higher prices.
  • Both Lasso and Ridge Regression models demonstrated good predictive performance, with R-squared values around 0.92, meaning they explain roughly 92% of the variance in housing prices.
  • The analysis provides valuable insights for the real estate sector, guiding decision-making based on features like overall quality, living area, neighborhood, and property age to maximize property values.
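One way to read positive and negative effects like those listed above is to look at the signed coefficients of a linear model fitted on scaled features. The sketch below is illustrative only: the data is synthetic, the helper names (make_price_frame, signed_coefficients) and the fixed Lasso alpha are assumptions, and it does not reproduce the notebook's fitted values.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.preprocessing import MinMaxScaler


def make_price_frame(n=400, seed=1):
    """Synthetic data in which price rises with quality and area and falls with age."""
    rng = np.random.default_rng(seed)
    X = pd.DataFrame({
        "OverallQual": rng.integers(1, 11, n).astype(float),
        "GrLivArea": rng.uniform(500, 3500, n),
        "PropAge": rng.uniform(0, 100, n),
    })
    y = (20000 * X["OverallQual"] + 80 * X["GrLivArea"]
         - 500 * X["PropAge"] + rng.normal(0, 5000, n))
    return X, y


def signed_coefficients(X, y, alpha=1.0):
    """Fit Lasso on min-max scaled features; return coefficients ordered by magnitude."""
    X_scaled = MinMaxScaler().fit_transform(X)
    model = Lasso(alpha=alpha, max_iter=10_000).fit(X_scaled, y)
    coefs = pd.Series(model.coef_, index=X.columns)
    order = coefs.abs().sort_values(ascending=False).index
    return coefs.reindex(order)


if __name__ == "__main__":
    X, y = make_price_frame()
    print(signed_coefficients(X, y))
```

Because every feature is scaled to the same range before fitting, the coefficient magnitudes are comparable across features, and the sign shows the direction of the association.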

Business Insights:

  1. Emphasize property quality and size, as features like OverallQual and GrLivArea significantly impact housing prices.
  2. Choose neighborhoods strategically, considering the positive influence of certain areas on overall housing prices.
  3. Prioritize newer properties for investment, aligning with the market trend favoring modern constructions.

Technologies Used

  • Python: The primary programming language for data manipulation, analysis, and visualization.
  • NumPy: Used for numerical operations and efficient array handling.
  • Pandas: Employed for data manipulation, including data cleaning and preprocessing.
  • Matplotlib and Seaborn: Visualization libraries for creating insightful plots and charts.
  • Scikit-Learn: Utilized for machine learning tasks, including model building, feature selection, and data scaling.
  • Statsmodels: Leveraged for advanced statistical modeling and analysis, particularly for linear regression and VIF calculations.
  • Jupyter Notebook: The interactive environment for executing code, documenting the analysis, and presenting results.
  • MinMaxScaler: A feature scaling technique from Scikit-Learn for normalizing numerical data.
  • LassoCV: Implemented for Lasso Regression with cross-validated alpha selection.
  • Ridge: Applied for Ridge Regression.
  • Imputer: Used to fill missing values in selected features, ensuring robustness in data preprocessing (in current Scikit-Learn releases this role is filled by SimpleImputer).

License

This project is licensed under the MIT License.

[1] Himanshu S, "Housing Regression Analysis : Ridge and Lasso Regression Approach (2024)" @himanshuxd

Acknowledgments

  • This project was carried out for Surprise Housing, a US-based housing company seeking to enter the Australian market.
  • It was completed as part of LJMU and IIIT Bangalore's Master's in Machine Learning & Artificial Intelligence program.

Contact

  • For inquiries or collaborations, feel free to reach out on GitHub: @himanshuxd
