- nltk 3.3.0
- numpy 1.15.2
- pandas 0.23.4
- scikit-learn 0.20.0
- sqlalchemy 1.2.12
In this project, It will provide disaster responses to analyze data from Figure Eight to build a model for an API that classifies disaster messages.
This project will include a web app where an emergency worker can input a new message and get classification results in several categories. The web app will also display visualizations of the data.
Below are a few screenshots of the web app.
- ETL Pipeline Preparation.ipynb: Description for workspace/data/process_data.py
- ML Pipeline Preparation.ipynb: Description for workspace/model/train_classifier.py
- workspace/data/process_data.py: A data cleaning pipeline that:
- Loads the messages and categories datasets
- Merges the two datasets
- Cleans the data
- Stores it in a SQLite database
- workspace/model/train_classifier.py: A machine learning pipeline that:
- Loads data from the SQLite database
- Splits the dataset into training and test sets
- Builds a text processing and machine learning pipeline
- Trains and tunes a model using GridSearchCV
- Outputs results on the test set
- Exports the final model as a pickle file
I wish to thank Figure Eight for dataset, and thank Udacity for advice and review.