syde522-assignments

A neural-network-based approach was used to classify the dataset. It was implemented with PyTorch, an automatic differentiation library, using several different convolutional models. The training loop is loosely based on a PyTorch tutorial, modified for custom data loading and logging. The models tested included one loosely based on ResNet-18, with a similar residual structure, and the model described in the YOLOv2 paper (implementation from: https://github.com/marvis/pytorch-yolo2/blob/master/models/tiny_yolo.py). These models were selected for their strong representational ability, and the YOLO model additionally because it has significantly fewer parameters. Neither model is properly sized for the dataset; both have significantly more capacity than required, but the aim was to see how far data augmentation could compensate for this.

To perform this augmentation, the data loader randomly flipped each image, randomly selected a 160x160 crop from it, and then normalized the result. Cropping was chosen based on the observation that the images appeared to contain repeating patterns, so taking a sub-image does not change the class identity. These transforms greatly increase the effective size of the dataset. The training hyperparameters (learning rate and small changes to the models) were selected through experimentation.

For validation, the same procedure was followed, except that instead of a random crop, 10 crops were taken from throughout the image and the results over these crops were averaged to generate a prediction. The results were validated using K-fold cross-validation with 3 folds. This generally resulted in about 94-97% validation accuracy across the models tested. From each cross-validation run, the model with the best validation accuracy over 128 epochs was saved. Additional networks were created by running the cross-validation loop multiple times.
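The augmentation and multi-crop validation steps described above can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the original data loader: the function names (`random_augment`, `eval_crops`, `predict_averaged`), the normalization constants, the horizontal flip axis, and the diagonal layout of the validation crops are all hypothetical. The write-up only states that crops are 160x160, that flips are random, and that 10 crops are averaged at validation time.

```python
import numpy as np

CROP = 160  # crop size stated in the write-up


def normalize(img, mean=0.5, std=0.25):
    # mean/std are placeholder values, not the ones used in the original code
    return (img.astype(np.float32) / 255.0 - mean) / std


def random_augment(img, rng):
    """Training-time transform: random flip, random 160x160 crop, normalize."""
    h, w = img.shape[:2]
    if rng.random() < 0.5:
        img = img[:, ::-1]  # horizontal flip (the flip axis is an assumption)
    top = rng.integers(0, h - CROP + 1)
    left = rng.integers(0, w - CROP + 1)
    return normalize(img[top:top + CROP, left:left + CROP])


def eval_crops(img, n_crops=10):
    """Validation-time transform: n_crops fixed crops spread over the image."""
    h, w = img.shape[:2]
    # Crop origins are spread along the image diagonal; the original
    # layout of the 10 crops is not documented, so this is an assumption.
    tops = np.linspace(0, h - CROP, n_crops).astype(int)
    lefts = np.linspace(0, w - CROP, n_crops).astype(int)
    return np.stack(
        [normalize(img[t:t + CROP, l:l + CROP]) for t, l in zip(tops, lefts)]
    )


def predict_averaged(model_fn, img, n_crops=10):
    """Average per-crop class scores and return the winning class index."""
    scores = model_fn(eval_crops(img, n_crops))  # shape: (n_crops, n_classes)
    return int(np.argmax(scores.mean(axis=0)))
```

Here `model_fn` stands in for any callable that maps a batch of crops to per-class scores; in a PyTorch setting it would wrap the trained network's forward pass.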
Due to a GPU memory leak, a maximum of about 20 models was retained. All of these models were then evaluated on the test data, and their individual classifications were recorded. A voting scheme was used to ensemble the models, selecting the class with the most votes, or the class with the lower index in the case of a tie.

To improve scores, a real held-out test set should be created to help with tuning. Moreover, the model size should be significantly reduced, which should lower the risk of overfitting and improve generalization. Further data augmentation, such as image rotation, could also be used, provided it is properly implemented (the version that was tested produced artifacts). The submission with the best performance (96.666% on the test set) used the YOLO network with 128 epochs, 3 folds, and 7 extra iterations.
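The voting scheme used for the ensemble can be illustrated with a short sketch. The function names (`majority_vote`, `ensemble_predict`) and the list-of-lists input layout are assumptions made for illustration; only the rule itself (most votes wins, lower class index breaks ties) comes from the description above.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most-voted class; ties go to the lower class index."""
    counts = Counter(votes)
    best = max(counts.values())
    return min(cls for cls, n in counts.items() if n == best)


def ensemble_predict(per_model_preds):
    """Combine predictions where per_model_preds[m][i] is model m's
    predicted class index for test sample i."""
    return [majority_vote(sample_votes) for sample_votes in zip(*per_model_preds)]
```

For example, `majority_vote([3, 1, 1, 3])` returns `1`, because classes 1 and 3 are tied at two votes each and the lower index wins.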

(Submission for Kaggle: https://www.kaggle.com/c/digitalpathology/leaderboard)

Contributors

danieldworakowski
