The project involves two parts: developing code for an image classifier using PyTorch, and then converting that code into a command-line Python application. The vgg16 and densenet121 neural network models have been used for this project.
The first part of the project involves implementation of an image classifier through a Jupyter notebook using PyTorch.
For detailed project specifications, please refer to the Jupyter notebook Image Classifier Project.ipynb
- Package Imports: All the necessary packages and modules are imported in the first cell of the notebook
- Training data augmentation: torchvision transforms are used to augment the training data with random scaling, rotations, mirroring, and/or cropping
- Data normalization: The training, validation, and testing data is appropriately cropped and normalized
- Data batching: The data for each set is loaded with torchvision's DataLoader
- Data loading: The data for each set (train, validation, test) is loaded with torchvision's ImageFolder
- Pretrained Network: A pretrained network such as VGG16 is loaded from torchvision.models and the parameters are frozen
- Feedforward Classifier: A new feedforward network is defined for use as a classifier using the features as input
- Training the network: The parameters of the feedforward classifier are appropriately trained, while the parameters of the feature network are left static
- Testing Accuracy: The network's accuracy is measured on the test data
- Validation Loss and Accuracy: During training, the validation loss and accuracy are displayed
- Loading checkpoints: There is a function that successfully loads a checkpoint and rebuilds the model
- Saving the model: The trained model is saved as a checkpoint along with associated hyperparameters and the class_to_idx dictionary
- Image Processing: The process_image function successfully converts a PIL image into an object that can be used as input to a trained model
- Class Prediction: The predict function successfully takes the path to an image and a checkpoint, then returns the top K most probable classes for that image
- Sanity Checking with matplotlib: A matplotlib figure is created displaying an image and its associated top 5 most probable classes with actual flower names
Run the Jupyter notebook in GPU-enabled mode to build the model and use it for prediction
Now that the deep neural network model is built and trained on the flower data set, it's time to convert it into an application so that others can use it for prediction. The application is a pair of Python scripts that run from the command line. For testing, it uses the model checkpoint that was generated and saved in the first part of the project.
The project submission includes two files, train.py and predict.py. The first file, train.py, trains a new network on a dataset and saves the model as a checkpoint. The second file, predict.py, uses a trained network to predict the class of an input image. A separate file, model.py, contains functions and classes relating to the model, and another file, utility.py, holds utility functions such as loading data and preprocessing images.
- Training a network: train.py successfully trains a new network on a dataset of images and saves the model to a checkpoint
- Training validation log: The training loss, validation loss, and validation accuracy are printed out as a network trains
- Model architecture: The training script allows users to choose from at least two different architectures available from torchvision.models
- Model hyperparameters: The training script allows users to set hyperparameters for learning rate, number of hidden units, and training epochs
- Training with GPU: The training script allows users to choose training the model on a GPU
- Predicting classes: The predict.py script successfully reads in an image and a checkpoint then prints the most likely image class and its associated probability
- Top K classes: The predict.py script allows users to print out the top K classes along with associated probabilities
- Displaying class names: The predict.py script allows users to load a JSON file that maps the class values to other category names
- Predicting with GPU: The predict.py script allows users to use the GPU to calculate the predictions
Train

Train a new network on a data set with train.py
- Basic usage:
python train.py --data_dir data_directory
- Prints out training loss, validation loss, and validation accuracy as the network trains
- Options:
- Set directory to save checkpoints:
python train.py --data_dir data_directory --save_dir save_directory
- Choose architecture:
python train.py --data_dir data_directory --arch "vgg16"
- Set hyperparameters:
python train.py --data_dir data_directory --learning_rate 0.01 --hidden_units 512 --epochs 20
- Use GPU for training:
python train.py --data_dir data_directory --device cuda
- Sample command:
python train.py --data_dir flowers --save_dir checkpoint.pth --arch vgg16 --learning_rate 0.005 --hidden_units 512 --dropout 0.05 --epochs 4
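After training, the model is saved as a checkpoint that predict.py can rebuild. The following is a minimal sketch of that round trip with a hypothetical key layout (the real keys depend on train.py), using a tiny stand-in model so the snippet is self-contained:

```python
import torch
from torch import nn

# A stand-in model; in train.py this would be the trained network
model = nn.Linear(4, 2)

# Hypothetical checkpoint layout: the hyperparameters plus the weights
# and the class_to_idx mapping that predict.py needs to rebuild the model
checkpoint = {
    "arch": "vgg16",
    "hidden_units": 512,
    "learning_rate": 0.005,
    "epochs": 4,
    "class_to_idx": {"1": 0, "10": 1},  # image folder name -> class index
    "state_dict": model.state_dict(),
}
torch.save(checkpoint, "checkpoint.pth")

# Loading restores the hyperparameters needed to rebuild the model
# before the saved weights are applied with load_state_dict
restored = torch.load("checkpoint.pth")
```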
Predict

Predict the flower name from an image with predict.py, along with the probability of that name. That is, you pass in a single image /path/to/image and it returns the flower name and class probability.
- Basic usage:
python predict.py --image_dir /path/to/image --model_input checkpoint
- Options:
- Return top K most likely classes:
python predict.py --image_dir /path/to/image --model_input checkpoint --top_k 3
- Use a mapping of categories to real names:
python predict.py --image_dir /path/to/image --model_input checkpoint --category_names cat_to_name.json
- Use GPU for inference:
python predict.py --image_dir /path/to/image --model_input checkpoint --device cuda
- Sample command:
python predict.py --image_dir ./flowers/test/74/image_01191.jpg --model_input checkpoint.pth --top_k 3 --category_names cat_to_name.json --device cuda
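The top-K step can be illustrated in isolation. This standalone sketch (not code from predict.py) uses five fake class scores: the model's LogSoftmax output is exponentiated back into probabilities, then torch.topk picks the K most likely classes.

```python
import torch

# Fake LogSoftmax output for a single image over five classes
log_ps = torch.log_softmax(torch.tensor([[0.1, 2.0, 0.3, 1.5, 0.2]]), dim=1)

# Convert log-probabilities back into probabilities
ps = torch.exp(log_ps)

# Pick the top 3 probabilities and their class indices
top_p, top_idx = ps.topk(3, dim=1)
```

The returned indices are positions in the model's output layer; predict.py maps them back to category labels through the checkpoint's class_to_idx dictionary.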
Note: The argparse module from the Python standard library has been used to read the command-line input into the scripts.
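A minimal sketch of the argparse setup for predict.py is shown below. The flag names come from the usage examples above, while the defaults and help strings here are assumptions, not the scripts' actual values.

```python
import argparse

parser = argparse.ArgumentParser(description="Predict flower name from an image")
parser.add_argument("--image_dir", required=True, help="path to the input image")
parser.add_argument("--model_input", required=True, help="path to the checkpoint")
parser.add_argument("--top_k", type=int, default=1, help="number of classes to show")
parser.add_argument("--category_names", help="JSON mapping of classes to names")
parser.add_argument("--device", default="cpu", choices=["cpu", "cuda"])

# Parse a sample command line instead of sys.argv, for illustration
args = parser.parse_args(
    ["--image_dir", "flowers/test/74/image_01191.jpg",
     "--model_input", "checkpoint.pth", "--top_k", "3"]
)
```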