A series of little puzzles and training exercises aimed at improving Python knowledge for data science
This repository is a collection of small training exercises aimed at keeping relevant data science skills honed. The focus is mainly on data manipulation and visualisation, but it may branch into other topics as well.
Each Jupyter Notebook in the main folder contains five assignments. The first four assignments are centred on a topic, while the fifth is purposefully off-topic. Usually the fifth is some nifty little trick that may come in handy in a data science workflow.
To get started, you need a Python installation and Jupyter Notebook. It is assumed that you have both and know how to work with them. After that, you can simply take the notebooks one at a time. If you do not have Python installed, the Anaconda distribution is a good option.
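If you are unsure whether your environment is ready, a quick check along the following lines may help. This is only a sketch: the package names `pandas` and `notebook` are assumptions based on the notebook titles in this repository, and `missing_packages` is a hypothetical helper, not something shipped with the exercises.

```python
import importlib.util
import sys

# Assumed requirements, guessed from the notebook names; adjust as needed.
REQUIRED = ["pandas", "notebook"]


def missing_packages(names):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python", sys.version.split()[0])
missing = missing_packages(REQUIRED)
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All assumed packages were found.")
```

Anything reported as not found can be installed with your package manager of choice before opening the first notebook.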
Should you get stuck, there is a corresponding notebook in the solutions folder with a worked-through example.
You are of course always welcome to open an issue here on GitHub if any of the exercises do not make sense. Chances are you are not the only one facing the problem.
- pandas_1: DataFrame shape, columns and data types
- pandas_2: Group By
- pandas_3: Merge and Concatenation
- pandas_4: Missing values
- pandas_5: I/O
- pandas_6: Melt and Pivot
- pandas_7: Time series - Part I: Creating and filtering on DatetimeIndex
- pandas_8: Time series - Part II: Custom frequency on DatetimeIndex
- pandas_9: Time series - Part III: Offsetting DatetimeIndex
The more exercises there are in a repo like this, the better. Everyone is welcome to contribute, as long as they follow the structure described above and in the contribution guidelines.
Even if you are not up for contributing exercises, you can always add a topic to the wishlist in the wiki!
Enjoy the exercises, and let me know if anything is left wanting.
/Philip