Applied two ML algorithms to three datasets using Python and R, and made the following comparisons:
1. Identification of identically named ML techniques
2. Identification of the better-performing technique
3. Identification of dataset characteristics that may impact the performance of ML techniques
Datasets:
- Bank Marketing Dataset: https://www.kaggle.com/datasets/janiobachmann/bank-marketing-dataset/data
- Lead Scoring Dataset: https://www.kaggle.com/datasets/amritachatterjee09/lead-scoring-dataset/data
- Adult Census Income Dataset: https://www.kaggle.com/datasets/uciml/adult-census-income
Machine Learning Algorithms Used: The datasets above contain a mix of numerical and categorical features, so the relationships between the features and the target may be non-linear. KNN classification is non-parametric and makes no linearity assumption, which makes it suitable for capturing such complex patterns. The datasets also have many features, and random forest classification copes well with high dimensionality and is robust to noisy or irrelevant features, which makes it a good second choice.
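The comparison described above can be sketched in Python as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn, pandas, and NumPy are available, and it uses a small synthetic table (the column names `age`, `balance`, `job` and the target rule are hypothetical) in place of the Kaggle CSVs. Numerical features are scaled and categorical features are one-hot encoded, since KNN relies on distances and is sensitive to feature scale.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical stand-in data with mixed feature types; replace this with
# one of the datasets listed above (column names here are assumptions).
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "balance": rng.normal(1000, 500, size=n),
    "job": rng.choice(["admin", "technician", "services"], size=n),
})
# Synthetic non-linear target: an XOR of a numeric and a categorical condition.
y = ((df["age"] > 40) ^ (df["job"] == "admin")).astype(int)

NUMERIC = ["age", "balance"]
CATEGORICAL = ["job"]


def make_preprocessor():
    """Scale numeric columns and one-hot encode categorical columns."""
    return ColumnTransformer([
        ("num", StandardScaler(), NUMERIC),
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ])


models = {
    "knn": KNeighborsClassifier(n_neighbors=5),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, random_state=0, stratify=y
)

# Fit each model on the same split and record its test accuracy.
scores = {}
for name, model in models.items():
    pipe = Pipeline([("preprocess", make_preprocessor()), ("model", model)])
    pipe.fit(X_train, y_train)
    scores[name] = pipe.score(X_test, y_test)

print(scores)
```

Using the same train/test split and the same preprocessing for both models keeps the accuracy comparison fair; the hyperparameters shown (`n_neighbors=5`, `n_estimators=100`) are scikit-learn defaults and would normally be tuned per dataset.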