GithubHelp home page GithubHelp logo

qw2243c / spring2018-project5-grp7 Goto Github PK

View Code? Open in Web Editor NEW

This project forked from tzstatsads/spring2018-project5-grp7

0.0 0.0 0.0 24.31 MB

Spring2018-Project5-grp_7 created by GitHub Classroom

R 3.57% CSS 0.76% Jupyter Notebook 34.32% Python 0.43% HTML 60.91%

spring2018-project5-grp7's Introduction

ADS Project 5 : What shall we cook tonight ?

image Term: Spring 2018

  • Team # 7

  • Project Title : What shall we cook tonight?

    → Our App Click Here!

  • Team members

    • Anshuma Chandak (ac4258)
    • Hanying Ji (hj2473)
    • Lan Wen (lw2773)
    • Qianhui Wan (qw2243)
    • Xueying Ding (xd2196)
  • Project summary: Food is an innate part of any culture or region. Every cuisine has some unique ingredients or some ingredients that are used in almost all of its dishes. If you visit Korea, the markets would be sprawled with kimchi and the smell of squids. The colourful and aromatic spice markets of India indicate the natural use of diverse spices in the Indian cooking. In this project, we predict the category of a dish's cuisine given a list of its ingredients. We are using Yummly's data which is arranged by cuisine, dish ID, and its ingredients. Aiming to combine and use the knowledge from other projects in this course, and build a product that has high functionality and usability, we have divided this project into two parts:

The first part is Algorithm Comparison. We started off with 6849 ingredients, and used the bag-of-words model to reduce the number of ingredients to 2000. Then use different algorithms (Random Forests, XGBoost, SVM, Logistic Regression, Decision Tree, KNN) to predict the category of a dish, and aim at improving the accuracy the prediction. Following is our algorithms summary, from which we can see that Logistic Regression outperforms others under this situation:

image

The second part is to build an R Shiny app for exploratory data analysis, as well as recommend the cuisine and other related cuisines given a set of a ingredients.

The following word clouds show the different cuisines, and the assortment of ingredients in our data set (the size reflects the number of recipes):

image image

We wish to apply a superior model to predict a specific cuisine given various different ingredients we put in. Image when it’s dinner time, you open the refrigerator and see there are kimchi, romaine lettuce inside. You wonder what to cook for tonight. So the shiny app could provide you a useful cooking recommendations given the ingredients you have. From the six algorithms we used in previous part, logistic regression model performs the best. For example, if you have kimchi and romaine lettuces as your inputs, you will get a predicted “Korean” cuisine result.

We also introduced Pearson Correlation algorithm to calculate the similarities between all 20 different cuisines in our dataset. So if you get a most relevant recommendation given the ingredients you have, the app could also provide other two most similar cuisines for your consideration.

image

Contribution statement: (default) All team members contributed equally in all stages of this project. All team members approve our work presented in this GitHub repository including this contributions statement.

  • Anshuma Chandak : Explored different topics, and suggested the current topic. Kept track of the project and set the timelines. Updated the Readme file (Write up & Wordclouds). Train & test XG Boost & Decision Tree models.
  • Hanying Ji : Mainly taking responsibility of building the R Shiny app for exploratory data analysis, which includes UI, serve and css style.
  • Qianhui Wan : Help design and built the framework of shiny app. integrate all parts in to Shiny app and the publish.
  • Lan Wen : Bag-of-words model, train/test data split, train and test Random Forest model, data cleaning and transformation for R Shiny logistic regression model and retrain logistic regression model in R (for Shiny part).
  • Xueying Ding : Applied SVM, KNN, Logistic model to classify.

References:

Following suggestions by RICH FITZJOHN (@richfitz). This folder is orgarnized as follows.

proj/
├── lib/
├── data/
├── doc/
├── figs/
└── output/

Please see each subfolder for a README file.

spring2018-project5-grp7's People

Contributors

hannahji avatar anshumachandak avatar lw2773 avatar qw2243c avatar xueyingding avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.