ds-projects's Introduction

Data Science Projects

Presentations about data science.

Supervised Learning Projects

  • Detecting_Implicit_Bias_in_Traffic_Stops by Mark Ferguson.

  • Lemons: Predicting whether a Vehicle will be kicked back to the auction by Will Morgan.

  • Predicting the success of cyber-related terrorist attacks by Rebecca Green.

  • Breast cancer survivor models by Rich Gohram.

  • Predicting Disruptive Children (including visualization of PCA on binary variables) by Greg Condit.

  • Predicting churning telco customers by Eve Ben Ezra. Churn, or customer attrition, is the loss of customers. Churn is an area of interest for many industries, since it is often more expensive to bring in a new customer than to retain one. Using the popular Telco Customer Churn dataset from Kaggle, I hope to explore the data and determine which features might cause a customer to leave, and whether a combination of features might make a customer "high risk" for leaving the company.

  • Santander Bank Customer Transaction Prediction by Fred Etter. Santander Bank is trying to predict whether a customer will make a specific transaction in the future. Anonymized data with 200,000 rows and 200 columns was provided through Kaggle. Multiple supervised learning algorithms were tested and evaluated to determine the best method and produce an accuracy metric.

  • Safe Driver Prediction for Automobile Insurance by Murali Mandayam. Correctly classifying a driver during underwriting is an important aspect of automobile insurance. All the supervised learning algorithms I used classify a driver as a 1, to indicate a safe driver, or a 0, to indicate that the driver's information needs a review prior to issuing a policy.

  • Digit Recognizer by Slava Sablin. A pretty straightforward approach to test some basic models and their combinations on a classic machine learning problem. The goal is to correctly identify digits from the MNIST ("Modified National Institute of Standards and Technology") dataset of tens of thousands of handwritten images.

  • Predicting Divorce by Helen Skinner. Is it possible to predict whether an individual has ever been divorced based on their demographic traits? This supervised learning project tests 5 different algorithms to find out.

  • Predicting Forest Fire Causes by Matt Francsis. Human-caused fires account for between 43 and 59% of all wildfires in the western US. While wildfires can be beneficial to the ecosystem, they also pose serious threats to lives, property, and infrastructure. Predicting the cause of forest fires can help investigators bring arsonists to justice and act as a catalyst for fire abatement strategies. This talk will discuss supervised learning modeling techniques for this large, imbalanced, multi-class problem.
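
Several of the projects above deal with imbalanced classes (the forest fire entry says so explicitly). As a minimal sketch of one common way to handle that, and not necessarily what any of these projects did, the snippet below computes "balanced" class weights as n_samples / (n_classes * class_count). The function name and the label values are hypothetical and made up for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class: n_samples / (n_classes * class_count).

    Rare classes get larger weights, so a weighted loss pays more
    attention to them.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical fire-cause labels, invented for this example only.
causes = ["lightning"] * 6 + ["debris burning"] * 3 + ["arson"] * 1
weights = balanced_class_weights(causes)

for cause, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{cause}: {weight:.3f}")
```

Weights like these can be passed to any learner that accepts per-class or per-sample weights; whether that beats resampling depends on the data.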

Unsupervised learning report

  • Math lectures Part 1 by William Morgan. Combines NLP with supervised and unsupervised learning to classify math lectures.

Final capstone

  • Predicting Life Expectancy by Country by Trent Casillas. Using linear regression, mixed effect models, and clustering to predict a country's average life expectancy and determine the important factors behind it.

  • Cover to Cover: A (not so) Novel Approach to Book Recommendations by Mark Espina. The saying goes "Don't judge a book by its cover." But why? Anyone who shops at a local bookstore is definitely paying attention to the covers. And from personal experience, the cover is a key determinant of whether I end up purchasing a book. First, I will discuss the pros and cons of applying Convolutional Neural Nets to image classification, attempting to predict genre labels. In the second half, I will explore the application of feature extraction with similarity models as the basis for a content-based image retrieval system, Cover-to-Cover.

  • Using machine learning to cluster and classify math lectures by Will Morgan.

  • Capstone_2016_us_elections by Emile Badran. In this capstone project, I process tweets from the leading Democratic (Hillary Clinton) and Republican (Donald Trump) candidates and key 2016 US election hashtags. I apply Natural Language Processing and Network Analysis techniques to find the key topics, and the most influential actors that have guided the public debate.

  • DNA Sequence detection with Genetically trained weights by Christopher Sanchez.

  • Assessing Gender Bias in Tech Job Descriptions by Tiffany French. After reading a report and infographic from the World Economic Forum about gender inequity in AI positions, I designed this project to use NLP techniques to assess bias in job descriptions that could ultimately lead to the gender inequity we see in hiring. I used web scraping techniques, LDA and pyLDAvis, as well as supervised techniques to gain understanding and identify future areas of research.

  • PyTrader: Algorithmic Trading and Time Series Predictions Using LSTM by Sohaib Khuram. After exploring the capabilities of time series models through traditional ARIMA methods and LSTM neural networks, I decided to use these models to predict stock price direction and implement algorithmic trading strategies to see how accurate the results are. Using 4 separate strategies based on technical indicators, I was able to create an accurate model using LSTM that closely replicated trade signals around the original data. The strategies were then backtested on Quantopian to see how they performed on historical data.

  • Predicting Chicago Crime by Paul Schmidt. Since the late 1800s, Chicago has been infamous for its crime rates. Using crime and weather data from 2001 to the present, this project explores patterns in the frequency, type, and location of Chicago's crime, with the aim of informing Chicago residents and equipping the Chicago Police Department with crime forecasts.
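
The Cover to Cover entry above describes content-based retrieval built on extracted features plus a similarity model. A minimal sketch of that idea follows, assuming cosine similarity as the similarity measure (the entry does not say which one was used). The feature vectors and cover IDs are hypothetical; in practice the vectors would come from a trained network rather than being typed by hand.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query_id, features, top_k=2):
    """Return the IDs of the top_k items closest to query_id."""
    query = features[query_id]
    scored = [
        (cosine_similarity(query, vec), other_id)
        for other_id, vec in features.items()
        if other_id != query_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other_id for _, other_id in scored[:top_k]]


# Hypothetical 3-dimensional feature vectors for four book covers.
features = {
    "cover_a": [0.9, 0.1, 0.0],
    "cover_b": [0.8, 0.2, 0.1],
    "cover_c": [0.0, 0.1, 0.9],
    "cover_d": [0.1, 0.9, 0.2],
}

recommended = most_similar("cover_a", features)
print(recommended)
```

A linear scan like this is fine for a small catalogue; a larger one would likely need an approximate nearest-neighbour index.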

Coursework repositories

  • Please fork this repo, add a link, and make a pull request to add your repo here.
