GithubHelp home page GithubHelp logo

customer-churn-in-telecom's Introduction

Predicting Customer Churn in Telecommunication using Data Mining Techniques

Dataset Source :

https://community.ibm.com/community/user/businessanalytics/blogs/steven-macko/2019/07/11/telco-customer-churn-1113

Dataset Description:

*Each row represents a customer, each column contains customer’s attributes described on the column Metadata.

*The data set includes information about:

*Customers who left within the last month – the column is called Churn

*Services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies

*Customer account information – how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges

*Demographic info about customers – gender, age range, and if they have partners and dependents

Inspiration

To explore this type of models and learn more about the subject.

Project Description

All the standard steps of Knowledge Discovery that is Selection, Preprocessing, Transformation, Training and Evaluation have been used in this project. First the standard Pyhton Machine Learning libraries are imported eg. numpy, pandas, etc. The dataset is stored as a pandas DataFrame reffered by variable dataset.

Data Cleaning

CustomerID column is droped as it does not contribute to the result of Churn.

DataFrame currently has 1 float, 2 int and 17 object type. In order to implement the Machine Learning Models we need to have all the features as numeric datatype.

Data Visualisation

All the features are plotted against Churn to find interesting patterns int the dataset. for eg. Customers Having Internet Service of type fibre optics are more likely to Churn than Customers with No Internet Service.

Data Preprocessing

First the data type of Total Charges feature column is changed from objecct to float type. Then the null and NaN values are detecting in the dataset. There were 10 NULL values in Total Charges Column so we use Simple Imputer Class of Sckit Learn Package to impute the NULL values with mean of the remaining data from the column Total Charges. This efectively takes cake of the NULL values.

Encoding Categorical Data

Binary features are encoded using Label Encoding that is Yes = 1 and No = 0 etc. Nominal Features that features having more that 2 categorical values are encoded using One Hot Encoding Method. Finally we get 41 columns in feature matrix.

Train Test Split

The dataset is splitted into 4 numpy arrays x_train, y_train, x_test, y_test where Training arrays have 80% of the total data present in the dataset and rest 20% is used for testing.

Training the Model

The Model is then trained on differnt Machine Learning Models to predict the class of the Churn depending upon the input. We save the predicted data in a numpy array y_pred.

Testing the Model

The y_pred is compared with the y_test array and an accuracy is estimated based on this comparision.

K-Fold Cross Validation

Cross validation od the training accuracies is performed and their mean is calculated as well as Standard Deviation between accuracies is calculated.

Hyperparameters

All the parameters of the attributes are tested to get the best model parameters.

Visualising Results

Yet to be Completed!

customer-churn-in-telecom's People

Contributors

sandip-sk avatar

Watchers

Neeraj avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.