Please clone this repo to receive all the assets you need to work on this take home test.
The src directory contains:
- A few spreadsheet files containing survey data
- A general data spreadsheet
- Clock-in and clock-out times
A large telecommunications company employs around 4,000 people. Every year, around 15% of its employees leave and need to be replaced. Management believes this level of attrition is a problem for the following reasons:
- Projects are delayed, which damages the company's reputation among consumers and partners
- Maintaining a large recruitment department requires significant resources
- Productivity and effectiveness are reduced during the onboarding period for new staff
Hence, they contracted a workplace engineering & analytics firm to understand what factors are contributing to the high attrition, and what changes they should make to their workplace to support better retention. In addition, given limited resources, the company would like to know which variable is the most important and needs to be addressed straight away.
Since you are the superstar data scientist on the workplace engineering team, you’ve been tasked with the project.
- You are required to model the probability of attrition
- You will present the outcome and your recommendations to the (hypothetical) senior management team to help them understand what changes they should make to their workplace, in order to reduce the current attrition rate.
- As a data strategist, are there any recommendations you would make to improve the efficiency of our data collection/analysis process?
- Create a private GitHub repository with the code (please do not share your solution publicly).
- Give the following users read access to your private repository: pranampartab, fanguman, eoc7, sarastegemoller
- Provide precise instructions for deploying the solution and replicating your results.
- Submit a document with the repository link to the URL at the bottom of this email.
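For orientation only, the "probability of attrition" requirement above could be approached with a logistic regression. The brief does not prescribe a method, so treat the following as a minimal sketch under stated assumptions: the two feature names, the toy rows, and the learning-rate settings are all hypothetical and are not taken from the provided datasets.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(x, weights, bias):
    # Estimated probability that an employee with features x leaves.
    return sigmoid(bias + sum(w * xj for w, xj in zip(weights, x)))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(rows)
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_proba(x, weights, bias) - y
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical, made-up rows: [standardized daily hours, standardized satisfaction].
rows = [[1.2, -1.0], [0.8, -0.5], [1.0, 0.2],
        [-0.9, 0.8], [-1.1, 1.0], [-0.5, 0.3]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = left the company, 0 = stayed

weights, bias = fit_logistic(rows, labels)
p_long_hours = predict_proba([1.0, -1.0], weights, bias)
p_short_hours = predict_proba([-1.0, 1.0], weights, bias)
```

When the features are standardized, comparing the magnitudes of the fitted weights gives a rough first indication of which variable matters most, though a real submission would likely use an established library and validate on held-out data.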
Your solution will not be used or copied in any of our products. This exercise is for interview purposes only. We look forward to receiving your submission!
You will find six additional files attached to this project: five datasets (employee survey data, general data, manager survey data, in_time, and out_time) and a data dictionary. All variables are defined and explained in the “data dictionary” sheet.
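The in_time and out_time datasets usually need to be reduced to a per-employee feature before modeling. The sketch below shows one possible way to derive average daily working hours. It assumes timestamps formatted as `YYYY-MM-DD HH:MM:SS` and missing days marked as `NA` or left empty; both are assumptions to verify against the actual files and the data dictionary, and the sample records are made up.

```python
from datetime import datetime
from statistics import mean

# Assumed timestamp layout; confirm against the real files.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
# Assumed markers for days without a clock record (leave, holidays).
MISSING = {"", "NA"}


def mean_daily_hours(clock_ins, clock_outs):
    """Average hours between paired clock-in and clock-out strings.

    Days where either value is missing are skipped. Returns None when
    no complete day is available.
    """
    durations = []
    for t_in, t_out in zip(clock_ins, clock_outs):
        if t_in in MISSING or t_out in MISSING:
            continue
        start = datetime.strptime(t_in, TIME_FORMAT)
        end = datetime.strptime(t_out, TIME_FORMAT)
        durations.append((end - start).total_seconds() / 3600)
    return mean(durations) if durations else None


# Made-up records for one hypothetical employee, not taken from the data.
ins = ["2015-01-02 09:30:00", "NA", "2015-01-06 10:00:00"]
outs = ["2015-01-02 17:30:00", "NA", "2015-01-06 17:00:00"]
avg_hours = mean_daily_hours(ins, outs)
```

Skipping incomplete days rather than counting them as zero hours keeps absences from dragging the average down; the number of missing days could be kept as a separate feature if it proves informative.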