kaggle-berlin

Material of the Kaggle Berlin meetup group!

Collection of Sources

If you want a comprehensive introduction to the field, you can find decent advice [here]. Note that it is written as a guide for AI safety, yet the areas it outlines, along with the books and sources it lists, give a fairly good overview.

Here is a small but growing collection of sources that we have discussed in our hack sessions.

Star ratings range from ⭐ to ⭐⭐⭐⭐⭐ and are subject to discussion in the Kaggle group.

Tutorials

[0] Nicolas P. Rougier, Python & Numpy [link] (Outstanding Numpy introduction for scientists and optimizers) ⭐⭐⭐⭐⭐

[1] Sebastian Ruder, gradient descent methods [link] (If you are wondering what stochastic gradient descent, Nesterov momentum, Adam, ... are all about.) ⭐⭐⭐

[2] Scikit-learn documentation [link] (Absolutely great read to start learning about specific topics. Tons of superb example code. When I am bored I spend time here!) ⭐⭐⭐⭐⭐

[3] Donne Martin, "Data Science iPython notebooks." [Github repository] (Some useful examples to learn from.) ⭐⭐
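As a small companion to the gradient descent overview in [1], here is a minimal sketch of gradient descent with classical momentum in plain Python. The toy objective `f(x) = (x - 3)^2`, the function names, and the hyperparameter values are assumptions chosen for illustration; they are not taken from the linked article.

```python
def gradient_descent(grad, x0, lr=0.1, momentum=0.0, steps=100):
    """Minimize a 1-D function from its gradient, with optional classical momentum."""
    x = x0
    velocity = 0.0
    for _ in range(steps):
        # Momentum keeps a decaying running sum of past gradient steps.
        velocity = momentum * velocity - lr * grad(x)
        x = x + velocity
    return x


# Hypothetical toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
def grad_f(x):
    return 2.0 * (x - 3.0)


x_plain = gradient_descent(grad_f, x0=0.0)                   # momentum = 0: plain gradient descent
x_momentum = gradient_descent(grad_f, x0=0.0, momentum=0.9)  # same problem with momentum
print(x_plain, x_momentum)
```

Both runs should end close to the minimum at 3. On a well-conditioned problem like this one momentum is not needed; it tends to pay off on objectives with narrow, elongated valleys.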

Toying and Fun (but still learning)

[0] Andrej Karpathy: CNNs in the browser [link] (Great to gain some intuition.) ⭐⭐⭐

[1] Loss Function tumblr [link] (If you do not suffer from PTSD from neural network training already ;)) ⭐⭐⭐⭐

[2] TensorFlow in the browser [link] (Start with this when you learn about NNs!) ⭐⭐⭐⭐

[3] Narayanan, Arvind; Shmatikov, Vitaly: Robust De-anonymization of Large Sparse Datasets [paper] (Ridiculous example of de-anonymization - this should make you very afraid! Anonymous identities in the Netflix challenge data set are uncovered using publicly available data on IMDb.)

[4] IBM personality insights [link] (Maps text to big five personality traits with Twitter or free text input. Supports English, Spanish, Arabic, and Japanese.) ⭐⭐⭐⭐

[5] Visualizing the DBSCAN algorithm [link] (Underrated clustering algorithm. In our view, k-means and DBSCAN are the only useful bread-and-butter clustering algorithms.) ⭐⭐⭐⭐
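To go with the DBSCAN visualization in [5], here is a minimal, unoptimized sketch of the algorithm in plain Python. The function name, the toy points, and the `eps` / `min_pts` values are assumptions for illustration. The neighbor search is brute force (quadratic), so for real data use a library implementation instead.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""

    def neighbors(i):
        # Brute-force region query; the point itself counts as its own neighbor.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may turn into a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, so expand through it
    return labels


# Hypothetical toy data: two tight squares and one isolated point.
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11),
              (5, 5)]
toy_labels = dbscan(toy_points, eps=1.5, min_pts=3)
print(toy_labels)
```

Unlike k-means, the number of clusters is not an input here: it falls out of the density parameters, and isolated points are reported as noise rather than forced into a cluster.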

Practical Tips

[0] Aarshay Jain: Complete Guide to Parameter Tuning in XGBoost (with codes in Python) [link] (XGBoost has won many Kaggle competitions and belongs to the family of gradient-boosted tree models.)

[1] HJ van Veen: Feature Engineering [slideshare] (Read this to understand basics of preprocessing and feature engineering!) ⭐⭐⭐⭐

[2] hat y: Kaggle Ensembling Guide [link] (You must learn how to combine several submission files and stack several models if you want to score highly in contests.) ⭐⭐⭐⭐

[3] Megan Risdal: Communicating Data Science [kaggle blog] (Communication of your results is one of the major skills you have to learn - and you can exercise it in our group! It is a good summary of communication, presentation, and visualization.) ⭐⭐⭐

[4] Tim Dettmers: Which GPU(s) to Get for Deep Learning. [article] (Excellent guide on how to build your GPU machine, what to look for, and why cloud is too expensive) ⭐⭐⭐⭐
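The simplest way to combine several submission files, as mentioned for [2], is a weighted average of the predictions. The sketch below assumes each submission has already been loaded into a dict mapping a row id to a predicted probability; the function name, the ids, and the weights are hypothetical. It shows plain averaging only, not stacking.

```python
def average_submissions(submissions, weights=None):
    """Blend several {row_id: prediction} dicts into one by weighted averaging."""
    if weights is None:
        weights = [1.0] * len(submissions)  # default: equal weight per submission
    total = sum(weights)
    row_ids = submissions[0].keys()
    return {
        row_id: sum(w * sub[row_id] for w, sub in zip(weights, submissions)) / total
        for row_id in row_ids
    }


# Hypothetical predictions from two models for the same three rows.
model_a = {1: 0.2, 2: 0.8, 3: 0.5}
model_b = {1: 0.4, 2: 0.6, 3: 0.9}

blend_equal = average_submissions([model_a, model_b])
blend_weighted = average_submissions([model_a, model_b], weights=[3.0, 1.0])
print(blend_equal)
print(blend_weighted)
```

Averaging tends to help most when the blended models make different kinds of errors. Weights are best chosen on a held-out validation set rather than on the public leaderboard.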

Books

[0] Bengio, Yoshua, Ian J. Goodfellow, and Aaron Courville. "Deep learning." An MIT Press book. (2015). [pdf] (Good theory book to get started, modern! Then go to papers.) ⭐⭐⭐⭐

[1] Murphy, Kevin. "Machine Learning" An MIT Press book. (2012) [link] (Not a good starter book: comprehensive and mathematics-heavy. I use this as a reference manual.) ⭐⭐⭐

[2] Bishop, Christopher. "Pattern Recognition and Machine Learning" Springer. (2008) [link] (Written like a typical CS book, a bit outdated but solid introduction.) ⭐⭐⭐

[3] Abu-Mostafa, Yaser "Learning From Data" AMLBook (2012) [class site] (Good if you have only two months to learn ML; it also has an accompanying class at Caltech.) ⭐⭐⭐
