Python
gendermodel.py, genderclassmodel.py, and myfirstforest.py are used in the Titanic tutorial for Python at this link: https://www.kaggle.com/c/titanic-gettingStarted/data?gendermodel.py
QSTK_tutorial.py is for the Computational Investing course as seen in the link below: http://wiki.quantsoftware.org/index.php?title=CompInvestI_Homework_1
scrape.py is a web scraping Python script
LineChart_VORPvsWsp.py is similar to the statistical analysis I did using R
The files in the "MachineLearning" folder are code files from the book Programmer's Guide to Data Mining, http://guidetodatamining.com. In the filteringdata.py file, I implemented the Euclidean and CompareDistances functions and replaced the users in the data set with decks, using data obtained from http://yugioh.tcgplayer.com/db/deck_search_result.asp
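As a rough illustration of the idea behind the Euclidean and CompareDistances functions, here is a minimal sketch of a distance-based comparison between decks. It is not the code in filteringdata.py: the function names `euclidean` and `compare_distances`, the deck and card names, and the choice to measure distance only over shared cards are all assumptions made for this example.

```python
from math import sqrt

# Hypothetical data: each deck maps a card name to the number of copies it runs.
# These names and counts are made up and do not come from the scraped data.
decks = {
    "deck_a": {"Card X": 3, "Card Y": 2, "Card Z": 1},
    "deck_b": {"Card X": 3, "Card Y": 1},
    "deck_c": {"Card Y": 3, "Card Z": 3},
}


def euclidean(deck1, deck2):
    """Euclidean distance computed over the cards both decks contain."""
    shared = set(deck1) & set(deck2)
    if not shared:
        # Assumption: decks with no cards in common are treated as infinitely far apart.
        return float("inf")
    return sqrt(sum((deck1[card] - deck2[card]) ** 2 for card in shared))


def compare_distances(name, all_decks):
    """Return (distance, other_deck_name) pairs for every other deck, nearest first."""
    distances = [
        (euclidean(all_decks[name], other), other_name)
        for other_name, other in all_decks.items()
        if other_name != name
    ]
    return sorted(distances)


nearest = compare_distances("deck_a", decks)
print(nearest[0])  # the deck closest to deck_a and its distance
```

Restricting the sum to shared cards means a missing card is not counted as a zero; whether that is the right choice depends on how the deck data is meant to be compared.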
In the NBA folder, I performed statistical analysis on NBA data, similar to what I worked on using R. I also included the SVM_wins.py file, which uses a Support Vector Machine to try to predict which of two NBA teams playing each other will win. The raw data can be found at http://www.basketball-reference.com/
Each sample (in NBA12_14, NBA14_playoffs, etc.) represents the result of a game and has 16 features. The first column indicates whether the road team won: 1 means a win and 0 means a loss. The features are the offensive and defensive values of four factors (Effective Field Goal Percentage, Turnover Percentage, Offensive Rebound Percentage and Free Throws Per Field Goal Attempt) for both the road and home teams.
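To make the sample layout described above concrete, here is a minimal sketch that splits each row into its label (first column) and its 16 features. It assumes the samples are stored as comma-separated numbers with no header row, which may not match the actual files; the function name `load_samples` and the demo values are made up for illustration.

```python
import csv
import io

N_FEATURES = 16  # per the description: 4 factors x offense/defense x road/home


def load_samples(handle):
    """Read rows of 'label, f1, ..., f16' and return (labels, features)."""
    labels, features = [], []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        values = [float(v) for v in row]
        if len(values) != N_FEATURES + 1:
            raise ValueError(
                "expected %d columns, got %d" % (N_FEATURES + 1, len(values))
            )
        labels.append(int(values[0]))  # 1 = road team won, 0 = road team lost
        features.append(values[1:])
    return labels, features


# Made-up demo rows (not real NBA data), written to an in-memory file.
demo = io.StringIO()
writer = csv.writer(demo)
writer.writerow([1] + [0.5] * N_FEATURES)
writer.writerow([0] + [0.4] * N_FEATURES)
demo.seek(0)

labels, features = load_samples(demo)
print(labels, len(features[0]))
```

The resulting `labels` and `features` lists are in the shape most classifiers expect (one label and one feature vector per game), so they could be passed on to an SVM implementation.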