Find the patterns in sales wins and losses. Understand your sales pipeline, uncover what can lead to successful sales opportunities, and better anticipate performance gaps.
We will use pandas to read a data set from IBM's Watson repository. In Python, a database isn't the simplest solution for storing a bunch of structured data, which is why we load the data with pandas instead.
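As a rough illustration of the loading step, the sketch below reads a small CSV into a pandas DataFrame. The CSV text and its column names are hypothetical stand-ins for the real sales data set, which is not reproduced here.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file; the column names and values
# are illustrative assumptions, not the actual data set's contents.
csv_text = """opportunity_id,region,route_to_market,amount_usd,result
1,Northwest,Reseller,5000,Won
2,Pacific,Fields Sales,12000,Loss
3,Midwest,Reseller,7500,Won
4,Pacific,Telesales,3000,Loss
"""

# read_csv accepts a file path, a URL, or any file-like object.
sales_data = pd.read_csv(io.StringIO(csv_text))

# Inspect the first rows and the inferred column types.
print(sales_data.head())
print(sales_data.dtypes)
```

With the real data set, the `io.StringIO(csv_text)` argument would simply be replaced by the file's path or URL.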
Then we’ll dive into scikit-learn, using preprocessing.LabelEncoder() to convert the string columns into numbers and train_test_split() to split the data set into training and test samples. We will also use a cheat sheet to help us decide which algorithms to use for the data set. Finally, we will use three different algorithms (Naive Bayes, LinearSVC, and the K-Neighbors Classifier) to make predictions, and compare their performance using methods like accuracy_score() provided by the scikit-learn library. We will also visualize the performance scores of the different models using scikit-learn and Yellowbrick.
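The workflow described above can be sketched end to end as follows. This is a minimal example under stated assumptions: the data is synthetic, the column names are hypothetical, and GaussianNB is picked here as one possible Naive Bayes variant. The Yellowbrick visualization step is left out.

```python
import numpy as np
import pandas as pd
from sklearn import preprocessing
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC

# Synthetic stand-in data; column names and values are assumptions.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "region": rng.choice(["Northwest", "Pacific", "Midwest"], size=n),
    "route": rng.choice(["Reseller", "Telesales"], size=n),
    "amount_k": rng.integers(1, 20, size=n),  # deal size in thousands
})
# Made-up rule so the label is learnable: smaller deals are "Won".
df["result"] = np.where(df["amount_k"] < 10, "Won", "Loss")

# Convert every string column to integer codes with LabelEncoder.
for col in ["region", "route", "result"]:
    df[col] = preprocessing.LabelEncoder().fit_transform(df[col])

# Separate the features from the target, then split into train and test sets.
X = df.drop(columns="result")
y = df["result"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The three classifiers named in the text (GaussianNB is an assumed variant).
models = {
    "Naive Bayes": GaussianNB(),
    "LinearSVC": LinearSVC(max_iter=10000),
    "K-Neighbors": KNeighborsClassifier(n_neighbors=3),
}

# Fit each model and record its accuracy on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {scores[name]:.3f}")
```

The scores from this synthetic data say nothing about how the models rank on the real sales data; the point is only the shape of the encode, split, fit, and score loop.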