Notebooks for Master of Data Science Rennes
You can run these notebooks with Docker. Clone the repository into your home directory (the container mounts $HOME/big-data), then start the container. The Notebook server listens for HTTP connections on ports 8888 and 4040, with no authentication configured.
git clone https://github.com/pnavaro/big-data.git
docker run --rm -d -v $HOME/big-data:/home/jovyan/ -p 8888:8888 -p 4040:4040 pnavaro/big-data
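Because the container runs detached (`-d`), it can help to confirm that it actually started. The following is a minimal sketch, assuming the container was launched with the command above and is the only one running from this image. The variable names `IMAGE`, `NOTEBOOK_URL`, and `CONTAINER_ID` are illustrative and not part of the repository.

```shell
# Assumed values, taken from the docker run command above.
IMAGE=pnavaro/big-data
NOTEBOOK_URL=http://localhost:8888

# Look up the running container started from the image (empty if none is found).
CONTAINER_ID=$(docker ps -q --filter "ancestor=$IMAGE" 2>/dev/null || true)

if [ -n "$CONTAINER_ID" ]; then
    echo "Container $CONTAINER_ID is running; try opening $NOTEBOOK_URL"
else
    echo "No running container found for $IMAGE"
fi
```

When you are done, `docker stop "$CONTAINER_ID"` stops the container, and the `--rm` flag from the original command removes it automatically.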
- Python for Data Analysis by Wes McKinney.
- Python Data Science Handbook by Jake VanderPlas.
- Analyzing and Manipulating Data with Pandas Beginner: SciPy 2016 Tutorial by Jonathan Rocher.
- Parallelizing Scientific Python with Dask, SciPy 2017 Tutorial by James Crist.
- Writing an Hadoop MapReduce Program in Python by Michael G. Noll.
- Parallel Data Analysis in Python, SciPy 2017 Tutorial by Matthew Rocklin, Ben Zaitlen & Aron Ahmadia.
- Parallel Python: Analyzing Large Datasets Intermediate, SciPy 2016 Tutorial by Matthew Rocklin.
- Should you replace Hadoop with your laptop? by Vicki Boykis.
- Implementing MapReduce with multiprocessing by Doug Hellmann.
- DataCamp Cheat Sheets
- Outils pour le Big Data by Pierre Nerzic. 🇫🇷
- wikistat - Ateliers Big Data by Philippe Besse. 🇫🇷
Pierre