
webartifex / intro-to-python


An intro to Python & programming for wanna-be data scientists

Home Page: https://gitlab.webartifex.biz/alexander/intro-to-python

License: MIT License

Jupyter Notebook 98.64% Python 1.36%
introduction-to-programming python jupyter data-science tutorial

intro-to-python's Introduction

An Introduction to Python and Programming

This project is a thorough introductory course in programming with Python.

Table of Contents

The following is a high-level overview of the contents. For a more detailed version with clickable links see the CONTENTS.md file.

  • Chapter 0: Introduction
  • Part A: Expressing Logic
    • Chapter 1: Elements of a Program
    • Chapter 2: Functions & Modularization
    • Chapter 3: Conditionals & Exceptions
    • Chapter 4: Recursion & Looping
  • Part B: Managing Data and Memory
    • Chapter 5: Numbers & Bits
    • Chapter 6: Text & Bytes
    • Chapter 7: Sequential Data
    • Chapter 8: Map, Filter, & Reduce
    • Chapter 9: Mappings & Sets
    • Chapter 10: Arrays & Dataframes
    • Chapter 11: Classes & Instances

Videos

Presentations of the chapters are available on this YouTube playlist. The recordings are about 25 hours long in total and were made in spring 2020 after a corresponding in-class Bachelor course was cancelled due to the coronavirus pandemic.

Objective

The main goal is to prepare students for further studies in the "field" of data science, including but not limited to topics such as:

  • algorithms & data structures
  • data cleaning & wrangling
  • data visualization
  • data engineering (incl. SQL databases)
  • data mining (incl. web scraping)
  • linear algebra
  • machine learning (incl. feature generation & deep learning)
  • optimization & (meta-)heuristics (incl. management science & operations research)
  • statistics & econometrics
  • quantitative finance (e.g., option valuation)
  • quantitative marketing (e.g., customer segmentation)
  • quantitative supply chain management (e.g., forecasting)
  • web development (incl. APIs)

Prerequisites

The course is meant to be suitable for beginners, so there are no formal prerequisites. It is only expected that the student has:

  • a solid understanding of the English language,
  • knowledge of basic mathematics from high school,
  • the ability to think conceptually and reason logically, and
  • the willingness to invest around 90-120 hours in this course.

Getting started

If you are a total beginner, follow the instructions in the "Installation" section next. If you are familiar with the git and poetry command-line tools, you may want to look at the "Alternative Installation" section further below.

Installation

To follow this course, an installation of Python 3.11 or higher is expected.

A popular and beginner-friendly way to get Python is to install the Anaconda Distribution, which not only ships Python itself but also comes pre-packaged with a lot of third-party libraries.

Scroll down to the "Anaconda Installers" section and install the latest version for your operating system (i.e., 2024-02 with Python 3.11 at the time of this writing).

After installation, you will find an entry "Anaconda Navigator" in your start menu. Click on it.

A window opens giving you several options to start various applications. In the beginning, we will work mostly with JupyterLab. Click on "Launch".

A new tab in your web browser opens: The website is "localhost" and some number (e.g., 8888).

This is the JupyterLab application that is used to display the course materials. On the left, you see the files and folders on your computer. This file browser works like any other. In the center, you see several options to launch (i.e., "create") new files.

To check if your Python installation works, double-click on the "Python 3" tile under the "Notebook" section. That opens a new Jupyter notebook named "Untitled.ipynb".

Enter some basic Python in the code cell, for example, 1 + 2. Then, press the Enter key while holding down the Control key (if that does not work, try with the Shift key) to execute the snippet. The result of the calculation, 3 in the example, shows up below the cell.
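As a slightly more informative check than 1 + 2, the following minimal sketch can be pasted into a code cell to see whether the interpreter meets the version requirement stated above. The variable names are chosen for illustration only and are not part of the course materials.

```python
import sys

# The course expects Python 3.11 or higher.
required = (3, 11)

# sys.version_info starts with the major and minor version numbers.
installed = sys.version_info[:2]

# Tuples compare element by element, so (3, 12) >= (3, 11) is True.
meets_requirement = installed >= required

print("Installed Python version:", ".".join(str(part) for part in installed))
print("Meets the course requirement:", meets_requirement)
```

If the second line prints False, an older Python is being used by JupyterLab and a newer Anaconda version is likely needed.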

After setting up Python, click on the green "Code" button on the top right on this website to download the course materials. As a beginner, choosing "Download ZIP" is likely the easiest option. Then, unpack the ZIP file into a folder of your choice, ideally somewhere within your personal user folder so that the files show up right away in JupyterLab.

Alternative Installation (for Instructors using Linux)

Python can also be installed in a "pure" way, obtained directly from its core development team. Then, it comes without any third-party packages, which is not a problem at all. Managing third-party packages can be automated to a large degree, for example, with tools such as poetry.

However, this may be too "advanced" for a beginner as it involves working with a command-line interface (CLI), also called a terminal, which looks like the one below. It is used without a mouse by typing commands into it. The following instructions assume that git, poetry, and pyenv are installed.

The screenshot above shows how this project can be set up in an alternative way with the zsh CLI.

First, git is used to clone the course materials as a repository into a new folder called "intro-to-python" that lives under a "repos" folder.

  • git clone https://github.com/webartifex/intro-to-python.git

The cd command is used to "change directories".

In the screenshot, pyenv is used to set the project's Python version. pyenv's purpose is to manage many parallel Python installations on the same computer. It is highly recommended for professional users; however, any other way of installing Python works as well.

  • pyenv local ...

In contrast, poetry's purpose is to manage third-party packages within the same Python installation and, more importantly, on a per-project basis. So, for example, whereas "Project A" may depend on numpy v1.19 from June 2020, "Project B" may use v1.14 from January 2018 instead (cf. numpy's release history). To achieve this per-project isolation, poetry uses so-called virtual environments behind the scenes. While one could do that manually, for example, by using Python's built-in venv module, it is more convenient and reliable to have poetry automate this. The following single command not only creates a new virtual environment (manually: python -m venv venv) and activates it (manually: source venv/bin/activate), it also installs the versions of the project's third-party dependencies as specified in the poetry.lock file (manually: python -m pip install -r requirements.txt if a requirements.txt file is used; the python -m part is often left out but should not be):

  • poetry install
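To make the idea of a virtual environment a bit more tangible, here is a minimal sketch of the manual step python -m venv venv, expressed with the venv module from the standard library. It does not show how poetry works internally. The helper name create_environment is hypothetical, the environment is created in a temporary folder so that nothing is left behind, and with_pip=False is an assumption made only to keep the example quick.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(target: Path, with_pip: bool = False) -> Path:
    """Create a virtual environment in `target` and return its config file path.

    Hypothetical helper for illustration; roughly what `python -m venv` does.
    """
    venv.create(target, with_pip=with_pip)
    # Every virtual environment contains a "pyvenv.cfg" file at its top level.
    return target / "pyvenv.cfg"


with tempfile.TemporaryDirectory() as tmp:
    config_file = create_environment(Path(tmp) / "venv")
    created = config_file.exists()

print("Virtual environment created:", created)
```

A virtual environment is essentially a folder with its own site-packages directory, which is why two projects can pin different versions of the same library side by side.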

poetry is also used to execute commands in the project's (virtual) environment. To do that, the command is prefixed with poetry run ....

The project uses nox to manage various maintenance tasks. After cloning the repository and setting up the virtual environment, it is recommended to run the initialization task. That needs to be done only once.

  • poetry run nox -s init-project

To do the equivalent of clicking "Launch" in the Anaconda Navigator:

  • poetry run jupyter lab

This opens a new tab in your web browser just as above. The command-line interface stays open in the background, like in the screenshot below, and prints log messages as we work in JupyterLab.

Contributing

Feedback is highly encouraged and will be incorporated. Open an issue in the issue tracker or initiate a pull request if you are familiar with the concept. Simple issues that anyone can help fix are, for example, spelling mistakes or broken links. If you feel that some topic is missing entirely, you may also mention that. The materials here are considered a permanent work-in-progress.

A "Show HN" post about this course was made on Hacker News and some ideas for improvement were discussed there.

About the Author

Alexander Hess is a PhD student at the Chair of Logistics Management at WHU - Otto Beisheim School of Management where he conducts research on urban delivery platforms and teaches coding courses based on Python in the BSc and MBA programs.

Connect with him on LinkedIn.


intro-to-python's Issues

Rearrange contents in chapter 8

https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/01_content.ipynb

Move parts on list comprehensions into chapter 7 where list constructors are covered (or after the part on list methods because of .append()).

Move the nested and Cartesian product examples into exercises on their own.

Convert "Example: Averaging all even Numbers in a List (revisited)" section into a new section on "Streaming data".

Move tuple comprehensions part up into generator section.

Create new section on generator functions (make a unified generator section with expression and function as sub-section).
Note that the numeric example is not the same as in the first content file.
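For orientation, the three constructs this issue wants to regroup can be put side by side as follows. This is only a sketch: the numbers list and the even_squares name are made up for illustration and are not taken from the chapter.

```python
numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]  # hypothetical sample data

# List comprehension: builds the complete list in memory right away.
even_squares_list = [n ** 2 for n in numbers if n % 2 == 0]

# Generator expression: same syntax with parentheses, but lazy.
# No element is computed until something iterates over it.
even_squares_gen = (n ** 2 for n in numbers if n % 2 == 0)


def even_squares(iterable):
    """Generator function: the `yield` statement makes it lazy as well."""
    for n in iterable:
        if n % 2 == 0:
            yield n ** 2


# Consuming a generator once exhausts it.
total = sum(even_squares_gen)
leftover = list(even_squares_gen)

print(even_squares_list)
print(total, leftover)
print(list(even_squares(numbers)))
```

The expression and the function produce the same kind of object, which supports treating them as sub-sections of one unified generator section.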

Solve Towers of Hanoi with stacks

In a future chapter on common data structures, re-visit Towers of Hanoi and solve it with stacks.

cf. "Classic Computer Science Problems with Python", pp. 22
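A minimal sketch of what such a stack-based solution could look like, using plain list objects as stacks (append() pushes, pop() removes the top). The function name move_disks and the tower names are hypothetical; this is neither the book's code nor the course's existing solution.

```python
def move_disks(n, source, target, spare):
    """Move the top n disks from source to target and return the move count."""
    if n == 0:
        return 0
    # Step 1: move the n - 1 smaller disks out of the way onto the spare tower.
    moves = move_disks(n - 1, source, spare, target)
    # Step 2: move the largest of the n disks directly to the target tower.
    target.append(source.pop())
    moves += 1
    # Step 3: move the n - 1 smaller disks from the spare onto the target.
    moves += move_disks(n - 1, spare, target, source)
    return moves


num_disks = 3
# The end of each list is the top of the stack; larger numbers are larger disks.
left = list(range(num_disks, 0, -1))
middle, right = [], []

total_moves = move_disks(num_disks, left, right, middle)

print(left, middle, right)
print("Moves needed:", total_moves)
```

With three disks, all disks end up on the right tower after 2 ** 3 - 1 = 7 moves.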

Use shorter words

For example, the text usually writes out dict objects. That could be abbreviated as dicts or dictionaries to make the text easier to read.

Add content to notebook on text

  • bytes (and bytearray)
  • explain unicode
  • encode() vs decode()
  • normalizing unicode, case folding, sorting
  • maybe base64 encoding
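The topics in this list could be illustrated along the following lines. This is a sketch using only the standard library; the sample text is made up, and its non-ASCII characters are written as escape sequences so that the example does not depend on how the file itself is encoded.

```python
import base64
import unicodedata

text = "Stra\u00dfe caf\u00e9"  # hypothetical sample: "Straße café"

# encode() turns a str object into bytes; decode() goes the other way.
encoded = text.encode("utf-8")
decoded = encoded.decode("utf-8")
print(len(text), len(encoded))  # characters vs. bytes

# bytes objects are immutable, whereas bytearray objects can be changed.
mutable = bytearray(encoded)
mutable[0] = ord("s")

# Normalization: NFD splits the accented letter into a base letter plus
# a combining accent; NFC composes the two again.
decomposed = unicodedata.normalize("NFD", text)
recomposed = unicodedata.normalize("NFC", decomposed)

# Case folding is a more aggressive lower() meant for comparisons.
folded = text.casefold()
words_sorted = sorted(["banana", "Apple", "cherry"], key=str.casefold)

# base64 represents arbitrary bytes with ASCII characters only.
b64 = base64.b64encode(encoded)
restored = base64.b64decode(b64)

print(decoded == text, decomposed == text, recomposed == text)
print(folded, words_sorted, restored == encoded)
```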

Grammar Error in Chapter 4: Recursion & Looping Exercise

Q3: Complete the for-loop below such that it runs 100000 times! In the body, use your answer to Q2 to simulate a single throw of the fair_die and update the corresponding count in throws!

Hints: You need to use the indexing operator [] and calculate an index in each iteration of the loop. Do do not actually need the target variable provided by the for-loop and may want to indicate that with an underscore _.

I believe you mean "You do not".
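For readers without the notebook at hand, the loop described in the quoted exercise might look roughly like the sketch below. The names fair_die and throws come from the exercise text, but their contents, the use of random.choice() as the answer to Q2, and the seed are assumptions made for illustration.

```python
import random

random.seed(42)  # assumption: fixed seed only to make the run repeatable

fair_die = [1, 2, 3, 4, 5, 6]  # assumed definition of the die
throws = [0, 0, 0, 0, 0, 0]  # one counter per face of the die

# The target variable is not needed, which the underscore indicates.
for _ in range(100_000):
    number = random.choice(fair_die)  # assumed answer to Q2
    # Faces run from 1 to 6 but indexes from 0 to 5, hence the "- 1".
    throws[number - 1] += 1

print(throws)
```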

Introduce the term contiguous array in chapter 07 (lists)

Explain what a contiguous array is and that list (slots), dict (buckets), and others use it behind the scenes.

Explain offsetting: "It does not have to follow a reference to some "random" memory location once it has followed the reference to the dict object's "start" in memory."

Typos in Chapter 2 Content

We created a function object, dit not call it, and Python immediately forgot about it. So what's the point?

Presumably this should be "did not call it..."

sum and len are no keywords like for or if but variables that reference objects in memory.

Presumably this should read "sum and len are not keywords..."
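The point the quoted sentence makes can be checked with the standard library: the keyword module knows which words are reserved, and the builtins module holds the objects that names like sum and len refer to. The variable names in this small sketch are illustrative only.

```python
import builtins
import keyword

# `for` and `if` are reserved words of the language ...
for_is_keyword = keyword.iskeyword("for")
if_is_keyword = keyword.iskeyword("if")

# ... whereas `sum` and `len` are ordinary names.
sum_is_keyword = keyword.iskeyword("sum")
len_is_keyword = keyword.iskeyword("len")

# The names reference function objects that live in the builtins module.
same_object = sum is builtins.sum

# Like any other object, the function can be referenced by a second name.
add_up = sum
result = add_up([1, 2, 3])

print(for_is_keyword, if_is_keyword, sum_is_keyword, len_is_keyword)
print(same_object, result)
```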

Working slowly through the material and really enjoying it - thank you again for making this available!

Chapter 5 - Typo

The decimal system is intuitive to us humans, mostly as we learn to count with our ten fingers. The 0s and 1s in a computer's memory are therefore no rocket science; they only feel unintuitive for a beginner

Should this be "...NOT rocket science"

Exchange chapters 8 and 9

With the new section on abstract data types, it makes sense to move iterators behind the mappings & sets part.

Some exercises in chapter 9 assume generators in their hints => this should not cause trouble.

Rewrite the first part of the MFR content and show the map() and filter() example first using for-loops and temporary list objects.
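The proposed order of presentation could look roughly like this: first the explicit version with for-loops and temporary list objects, then the same computation with filter() and map(). The data and the squaring step are placeholders and not the example used in the chapter.

```python
numbers = [1, 2, 3, 4, 5, 6]  # hypothetical sample data

# Version 1: for-loops and temporary list objects.
evens = []
for number in numbers:  # "filter" step: keep only the even numbers
    if number % 2 == 0:
        evens.append(number)

squares = []
for number in evens:  # "map" step: transform every remaining element
    squares.append(number ** 2)

total = 0
for number in squares:  # "reduce" step: collapse everything into one value
    total += number

# Version 2: the same three steps with the built-ins, without temporary lists.
total_builtin = sum(
    map(lambda n: n ** 2, filter(lambda n: n % 2 == 0, numbers))
)

print(evens, squares, total, total_builtin)
```

Seeing the temporary lists first makes it clearer what map() and filter() save: they hand elements on one at a time instead of materializing evens and squares.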

Include Youtube Link in jupyter notebook lecture

Hi Alex,

Found your course on Hacker News - really excited about spending time learning Python. Please include the Youtube Links in the actual jupyter notebook lectures. Right now the only place with the links is on Hacker News, if you find this course through other channels, you will not know that there are corresponding video lectures on Youtube.

Also - I'm not sure if this is covered in the course yet, but a short introduction to Git would also be useful!

Thanks!

Grammar Issue: Chapter 8 Content

Let's assign the object to which the generator expression below evaluates to to a variable and inspect it.

Should this read to just a single 'to'?
