GithubHelp home page GithubHelp logo

employee-attrition's Introduction

Employee Attrition Prediction

Employee attrition prediction is a valuable tool for management. It helps in predicting the chances of an employee leaving the company and preparing accordingly to ensure a smooth transition or retention of employees. Key factors involved in employee attrition, such as salary, age, work experience, can be predicted using data available from the Human Resources Management (HRM) team.

In this project, we work with a dataset of employees from IBM and aim to predict if an employee will leave the company.

Objective

The objective of this project is to deeply analyze the available data, clean it, and build a model to predict employee attrition.

The Project

Database Description and Preprocessing

The dataset used for this project consists of 35 factors recorded for 1470 employees from IBM. The data is abstract, and some preprocessing was required. We encoded categorical data like gender, department, marital status using Label Encoder. We dropped columns like employee count, standard hours, and over 18, as they had the same values across all rows.

Model Selection

Based on the given case, we selected four models: Logistic Regression, Decision Tree Classifier, K-Nearest Neighbor, and Random Forest Classifier. We performed cross-validation, random undersampling, and random oversampling to determine which model performs best for the given dataset. Using metrics like mean score, z-score, and recall score, we determined that Logistic Regression is the best performing model.

Understandably, Logistic Regression stands out as the best-performing model, as it fulfills the assumptions made by the data, such as a binary target variable, independent observations, sample size, number of outliers, etc.

Model Preparation and Results

With the model decided, we conducted Exploratory Data Analysis (EDA) to identify the factors that have the most impact on employee attrition. We used a correlation matrix to determine the factors with high or low impact on attrition. We then individually visualized these impactful factors to gain a better understanding of their relationship with attrition. We selected the most impactful factors for modeling.

The dataset was randomly split into 70% for training and 30% for testing. We trained and tested our model and calculated its performance using a confusion matrix. We achieved a training accuracy of 88.9% and a testing accuracy of 86.8%.

employee-attrition's People

Contributors

l-k-1-2 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.