This repository contains the work required for the STAT 486: Machine Learning project. A brief description of the repository is as follows:
Contains the data used for analysis. A separate folder, FE (Feature Engineering), holds the datasets used for model fitting: the standard dataframe, the PCA-transformed data, and the Lasso variable-selection subset.
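As a rough illustration of how three feature sets like these could be produced, here is a minimal sketch using scikit-learn. The function name `build_feature_sets`, the synthetic data, and the values of `n_components` and `alpha` are all assumptions made for the example; they are not taken from the project's scripts.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler


def build_feature_sets(X, y, n_components=2, alpha=0.1):
    """Return standardized, PCA, and Lasso-selected versions of X (hypothetical helper)."""
    # Standardize so PCA and the Lasso penalty treat all columns on the same scale.
    X_std = StandardScaler().fit_transform(X)
    # PCA: project onto the first n_components principal components.
    X_pca = PCA(n_components=n_components).fit_transform(X_std)
    # Lasso variable selection: keep columns whose coefficient is non-zero.
    lasso = Lasso(alpha=alpha).fit(X_std, y)
    keep = np.flatnonzero(lasso.coef_ != 0)
    feature_sets = {"standard": X_std, "pca": X_pca, "lasso": X_std[:, keep]}
    return feature_sets, keep


# Synthetic stand-in data (not the project's data): only columns 0 and 1 drive y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

feature_sets, kept = build_feature_sets(X, y)
print({name: arr.shape for name, arr in feature_sets.items()})
print("columns kept by Lasso:", kept)
```

Each entry in `feature_sets` could then be saved as its own file so that every model is fit on the same three inputs.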
Handles the building, cleaning, EDA, and feature engineering of the data for the project. The number in each script's name gives the order in which the steps were completed.
Contains the graphs from EDA that were used in the project.
Compares each model's predictions with the number of crimes across the entire dataset. This served as a visual check of how well each model fit the data it was trained on.
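A visual check like this can be paired with a single in-sample error number per model. The sketch below is one possible way to do that in plain Python; the helper `in_sample_rmse`, the model names, and all of the counts are made up for illustration and are not results from the project.

```python
from math import sqrt


def in_sample_rmse(actual, predicted):
    """Root mean squared error between observed and fitted counts (hypothetical helper)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(actual))


# Made-up crime counts and fitted values, for illustration only.
actual = [12, 15, 9, 20]
predictions = {
    "model_a": [11, 16, 10, 18],
    "model_b": [12, 15, 9, 20],
}

scores = {name: in_sample_rmse(actual, preds) for name, preds in predictions.items()}
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: in-sample RMSE = {score:.3f}")
```

Because these errors are measured on the same data used for fitting, a low value shows only that a model can reproduce its training data, not that it will predict new data well.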
Outlines the models utilized and tested. We used the following models for our project:
- ACP
- KNN
- Lasso
- Logistic Regression
- Poisson Regression
- Linear Regression
- Random Forest
- SARIMA (time series)
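To show how several of the regression-style models in this list might be compared on count data, here is a minimal sketch using scikit-learn cross-validation. It is an assumption-laden example rather than the project's actual code: the data are synthetic, the hyperparameters are arbitrary, and Logistic Regression and SARIMA are left out because they need a classification target and a time-ordered series, respectively.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, PoissonRegressor
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

# Synthetic count data standing in for crime counts (not the project's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = rng.poisson(np.exp(0.5 + 0.3 * X[:, 0]))

# Hyperparameters below are arbitrary choices for the illustration.
models = {
    "KNN": KNeighborsRegressor(n_neighbors=5),
    "Lasso": Lasso(alpha=0.1),
    "Poisson Regression": PoissonRegressor(alpha=1e-3),
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# 5-fold cross-validated RMSE; scikit-learn reports it negated, so flip the sign.
cv_rmse = {
    name: -cross_val_score(
        model, X, y, cv=5, scoring="neg_root_mean_squared_error"
    ).mean()
    for name, model in models.items()
}

for name, score in sorted(cv_rmse.items(), key=lambda item: item[1]):
    print(f"{name}: CV RMSE = {score:.3f}")
```

Using one shared scoring metric and the same folds for every model keeps the comparison consistent across model families.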