rougier / from-python-to-numpy

An open-access book on numpy vectorization techniques, Nicolas P. Rougier, 2017

Home Page: http://www.labri.fr/perso/nrougier/from-python-to-numpy

License: Other

Python 98.32% CSS 1.64% Shell 0.03%
python numpy vectorization book open-access cc-by-nc-sa

from-python-to-numpy's Introduction

From Python to Numpy

Copyright (c) 2017 Nicolas P. Rougier
License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
Website: http://www.labri.fr/perso/nrougier/from-python-to-numpy

There are already a fair number of books about NumPy (see bibliography), so it is legitimate to wonder whether another book is really necessary. As you may have guessed by reading these lines, my personal answer is yes, mostly because I think there is room for a different approach concentrating on the migration from Python to NumPy through vectorization. Many techniques are not found in books and are mostly learned through experience. The goal of this book is to explain some of these techniques and to provide an opportunity to gain this experience along the way.

from-python-to-numpy's People

Contributors

adonese, brongulus, bruno314, corea, davidbradway, deeplook, dimiarbre, dr-neptune, edwardbetts, eitanlees, ev-br, gallyamb, i-namekawa, jessicayung, jiong3, joymonteiro, jwilk, lookfwd, mdboom, pa-beaufort, pdebuyl, rougier, seands, seberg, skovorodkin, tavito, tommyod, verginer, vladislavneon, wp-lai

from-python-to-numpy's Issues

4.4 Spatial vectorization/numpy implementation incorrect operators

There are three blocks containing code for the three rules.

This is a suspicious line used for normalizing in all the blocks:
target *= np.divide(target, norm, out=target, where=norm != 0)

I suppose it should be a plain = instead of *=.
In fact, the assignment itself looks redundant, given the out argument of np.divide.
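
A minimal sketch of the point above, using made-up values for target and norm (the shapes are an assumption, chosen so that norm broadcasts against target). With out=target, np.divide already normalizes in place, so wrapping the same call in *= multiplies the normalized array by itself:

```python
import numpy as np

# Hypothetical data: three 2D vectors, one of them zero.
target = np.array([[3.0, 4.0], [0.0, 0.0], [6.0, 8.0]])
norm = np.sqrt((target * target).sum(axis=1)).reshape(-1, 1)

# In-place normalization; rows whose norm is zero are left untouched.
result = np.divide(target, norm, out=target, where=norm != 0)
print(result is target)  # the returned array is the out array itself
print(target)

# The pattern quoted in the issue: the same call wrapped in *=.
squared = np.array([[3.0, 4.0], [0.0, 0.0], [6.0, 8.0]])
squared *= np.divide(squared, norm, out=squared, where=norm != 0)
print(squared)  # each normalized component ends up squared
```

Calling np.divide on its own line, without any assignment, is enough to normalize target.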

Code vectorization chapter

  • Introduction
  • Uniform vectorization
  • Differential vectorization (temporal)
  • Differential vectorization (spatial)
  • Conclusion

Typos 3.3

Thus, if you need fancy indexing, it's better to keep a copy of your fancy index (especially if it was complex to compute it) and to work with it:

...

If you are unsure if the result of your indexing is a view or a copy, you can check what is the base of your result. If it is None, then you result is a copy:

...

However, if your arrays are big, then you have to be careful with such expressions and wonder if you can do it differently
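
The view-versus-copy check described in the second quoted passage can be sketched as follows. The array and index values are illustrative only, and np.shares_memory is used here as an extra cross-check that is not part of the quoted text:

```python
import numpy as np

Z = np.arange(10)
Z_view = Z[2:5]          # basic slicing
Z_fancy = Z[[2, 3, 4]]   # fancy indexing

# A view keeps a reference to the array that owns the data.
print(Z_view.base is Z)
# For a result that owns its data, base is None.
print(Z_fancy.base is None)
# Cross-check: does the result overlap the original buffer?
print(np.shares_memory(Z, Z_view), np.shares_memory(Z, Z_fancy))
```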

Glumpy section incomplete

The Glumpy section is missing some text:

Glumpy

Glumpy is an OpenGL-based interactive visualization library in Python. Its goal is to make it easy to create fast, scalable, beautiful, interactive and dynamic visualizations. The main documentation for the site is organized into a couple of sections:

7.4 Conclusion

First vectorized approach in intro is incorrect algorithm.

So I started reading the book and have gotten to the introduction so far (section 2.1). I noted a couple of issues and thought "I can help!...kinda".

The first vectorized approach to the random walk, which used the code steps = random.sample([1, -1]*n, n) does not in fact produce a random walk.

This is because the random.sample() function performs sampling without replacement. All other random walk definitions I have seen, including the rest of the ones in the introduction, perform their sampling with replacement.

In order to bring this implementation into line with the others in the introduction I might suggest using the random.choices() function.

steps = random.choices([1,-1], k=n)
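
A small sketch of the difference, with an arbitrary n. The extreme case at the end is added only to make the dependence between draws visible: when the whole pool is drawn without replacement, the steps always cancel out.

```python
import random

n = 1000
pool = [1, -1] * n  # n steps of +1 and n steps of -1

# Without replacement: each draw changes what is left in the pool.
steps_without = random.sample(pool, n)

# With replacement: every step is an independent +1/-1 draw.
steps_with = random.choices([1, -1], k=n)

# Extreme case: drawing the entire pool is just a shuffle of it.
full_draw = random.sample(pool, 2 * n)
print(sum(full_draw))
```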

I also noticed a minor notational issue that could use some clarification. I believe that many people would refer to the "Functional approach" as a "Procedural approach". The word "Functional" is sometimes reserved for the functional programming style which makes heavy use of higher order functions. I cannot say, however, how universal such terminology is.

Looking forward to reading onward.

Syntax error on 3.3

In section 3.1, in the example on using .base for checking whether an object is a view or a copy of the original, there is a syntax error:

Z = np.random.uniform(0,1,(5,,5))

This isn't present in the repository, but for some reason is still present in the text on the book website.

Anatomy of an array introduction. Obvious way is the fastest.

Hello,
I've tried this code:

Z = np.ones(4 * 1000000, np.float32)
timeit("Z[...] = 0", globals())
timeit("Z.view(np.float16)[...] = 0", globals())
timeit("Z.view(np.int16)[...] = 0", globals())
timeit("Z.view(np.int32)[...] = 0", globals())
timeit("Z.view(np.float32)[...] = 0", globals())
timeit("Z.view(np.int64)[...] = 0", globals())
timeit("Z.view(np.float64)[...] = 0", globals())
timeit("Z.view(np.complex128)[...] = 0", globals())
timeit("Z.view(np.int8)[...] = 0", globals())

And got the following results:
100 loops, best of 3: 905 usec per loop
100 loops, best of 3: 918 usec per loop
100 loops, best of 3: 925 usec per loop
100 loops, best of 3: 915 usec per loop
100 loops, best of 3: 910 usec per loop
100 loops, best of 3: 912 usec per loop
100 loops, best of 3: 902 usec per loop
100 loops, best of 3: 1.9 msec per loop
100 loops, best of 3: 1.91 msec per loop

I don't understand the root cause of such opposite results. Could you kindly clarify?
Thanks in advance.

P.S. I'm using python 3.5.2 64bit version along with Anaconda.
The sysinfo() output:
Date: 01/02/17
Python: 3.5.2
Numpy: 1.11.1
Scipy: 0.17.1
Matplotlib: 1.5.1
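
For readers without the book's tools.py, the comparison above can be reproduced with the standard-library timeit module. This is only a sketch: the array is smaller than in the report, the loop count is arbitrary, and the timings depend on the NumPy version and the hardware, so no particular ordering is claimed.

```python
import timeit
import numpy as np

Z = np.ones(4 * 100_000, np.float32)  # smaller than in the report

for dtype in (np.float32, np.int8, np.int16, np.int32, np.int64, np.float64):
    view = Z.view(dtype)  # same buffer, reinterpreted with another item size
    seconds = timeit.timeit("V[...] = 0", globals={"V": view}, number=100)
    print(f"{np.dtype(dtype).name:>8}: {seconds / 100 * 1e6:8.1f} usec per loop")

# Every view writes into the same memory, so Z is zeroed whichever one is used.
print(Z.any())
```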

Maybe use ReST include directives

... if there's no policy of having everything in the ReST files. If not, includes would minimise the risk of code files and doc files being out of sync. And the ReST include directive has many options to pick only what you want, if you don't want to show the entire file. In the simplest case an include looks like this:

.. include:: code/anatomy.py
    :code: python

See http://docutils.sourceforge.net/docs/ref/rst/directives.html#including-an-external-document-fragment

Preface chapter

  • About the author
  • About this book
  • Pre-requisites
  • Conventions
  • License

colloquial #2

Consider changing:
"""
What would be the best way? The syntax below is rather obvious (at least for who is familiar with numpy) but the question is to known whether this is the fastest way.
"""
To:
"""
How does one write it to maximize speed? The below syntax is rather obvious (at least for those familiar with numpy) but the above question asks to find the fastest operation.
"""

Horizontal scroll appears

On my laptop's screen (viewport is 1304x702) the book has a horizontal scroll bar. Does anyone else have this issue?

Translation to other languages

Is there any proper way to translate this book into another language?
I would like to do that in Korean whenever I have some free time :)

Make available as python notebooks

Viewing the book inside Jupyter makes so much sense: you can still browse the book just like now, but being able to play with all the examples inline (and have the output computed instead of written) should lead to fewer bugs and even better examples.

So, is there a way to get this converted to jupyter notebook format?

colloquial #1

Change:
"""
If you don't comment your code at the time of writing, you'll be unable to guess what a function is doing after a few weeks (or even days)
"""
To:
"""
If you don't comment your code at the time of writing, you won't be able to tell what a function is doing after a few days (or even weeks)
"""

2.1 Introduction random walk example

Hi,
I cannot wrap my head around the big difference in execution times between timeit() from tools.py and the IPython magic function %timeit. Can someone help me?

Regards,
fonis.

%timeit "random_walk_fastest(n=10000)"
7.52 ns ± 0.258 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

timeit("random_walk_fastest(n=10000)", globals())
1000 loops, best of 3: 99 usec per loop
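
One likely explanation, sketched with the standard-library timeit module: %timeit takes the statement itself, so putting it in quotes makes IPython time a string literal rather than the function call. The random_walk_fastest below is a stand-in written for this sketch (an assumption, not copied from the book), and the loop count is arbitrary.

```python
import timeit
import numpy as np

def random_walk_fastest(n=1000):
    # Stand-in for the book's function, assumed for this sketch.
    steps = np.random.choice([-1, +1], n)
    return np.cumsum(steps)

loops = 1000
scope = {"random_walk_fastest": random_walk_fastest}

# Equivalent of: %timeit "random_walk_fastest(n=10000)"
# The statement being timed is only a string constant.
literal = timeit.timeit('"random_walk_fastest(n=10000)"', number=loops)

# Equivalent of: %timeit random_walk_fastest(n=10000)
# The statement being timed is the actual call.
call = timeit.timeit("random_walk_fastest(n=10000)", globals=scope, number=loops)

print(f"string literal: {literal / loops * 1e9:.1f} ns per loop")
print(f"function call:  {call / loops * 1e6:.1f} usec per loop")
```

In IPython, writing %timeit random_walk_fastest(n=10000) without the quotes should give figures comparable to the tools.py result.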

mention einsum

Dear Dr Nicolas P. Rougier,

I think it would be great if you could mention einsum, tensordot, kron and other powerful but less used numpy functions
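
To illustrate what such a mention could cover, here is a brief sketch of the three functions named above on small made-up arrays. It is not taken from the book.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)
B = np.arange(12).reshape(3, 4)

# Matrix product written as an explicit index contraction.
matmul_einsum = np.einsum("ij,jk->ik", A, B)

# Same contraction: sum over the last axis of A and the first axis of B.
matmul_tensordot = np.tensordot(A, B, axes=1)

# Trace of a matrix: repeated index, no output index.
trace = np.einsum("ii->", np.eye(3))

# Kronecker product: here it places a copy of A in each diagonal block.
K = np.kron(np.eye(2), A)
print(K.shape)
```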

2.2 Readability VS Speed

I'm facing issues with the below code -

def function_2(seq, sub):
    target = np.dot(sub, sub)
    candidates = np.where(np.correlate(seq, sub, mode='valid') == target)[0]
    check = candidates[:, np.newaxis] + np.arange(len(sub))
    mask = np.all((np.take(seq, check) == sub), axis=-1)
    return candidates[mask]

function_2('shasdefiran','def')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-69-3a3e84d6ff06> in <module>
----> 1 function_2('shasdefiran','def')

<ipython-input-64-dc47768a25a7> in function_2(seq, sub)
      1 def function_2(seq, sub):
----> 2     target = np.dot(sub, sub)
      3     candidates = np.where(np.correlate(seq, sub, mode='valid') == target)[0]
      4     check = candidates[:, np.newaxis] + np.arange(len(sub))
      5     mask = np.all((np.take(seq, check) == sub), axis=-1)

TypeError: ufunc 'multiply' did not contain a loop with signature matching types dtype('<U3') dtype('<U3') dtype('<U3')

Please help me understand the issue with the above code.
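
The traceback shows np.dot failing on string inputs: the function relies on numeric operations (np.dot, np.correlate), so it needs numeric sequences. A possible workaround, sketched here with a hypothetical helper to_codes that is not part of the book, is to map each character to its integer code point first:

```python
import numpy as np

def function_2(seq, sub):
    target = np.dot(sub, sub)
    candidates = np.where(np.correlate(seq, sub, mode='valid') == target)[0]
    check = candidates[:, np.newaxis] + np.arange(len(sub))
    mask = np.all((np.take(seq, check) == sub), axis=-1)
    return candidates[mask]

def to_codes(text):
    # Hypothetical helper: one integer code point per character.
    # int64 is used so that the dot product cannot overflow.
    return np.array([ord(c) for c in text], dtype=np.int64)

positions = function_2(to_codes("shasdefiran"), to_codes("def"))
print(positions)  # start index of each occurrence of "def"
```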

4.2. Uniform vectorization/Python implementation bug

There's a shape inconsistency in compute_neighbours function:

def compute_neighbours(Z):
    shape = len(Z), len(Z[0])
    N  = [[0,]*(shape[0]) for i in range(shape[1])]
    for x in range(1,shape[0]-1):
        for y in range(1,shape[1]-1):
            N[x][y] = Z[x-1][y-1]+Z[x][y-1]+Z[x+1][y-1] \
                    + Z[x-1][y]            +Z[x+1][y]   \
                    + Z[x-1][y+1]+Z[x][y+1]+Z[x+1][y+1]
    return N

The shape of N is currently the transpose of the shape of Z, which results in an index out of range exception for a non-square field.

Not critical for the example, but a bit confusing anyway.
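
A possible fix, sketched under the assumption that N should have the same shape as Z (the variable names rows and cols are introduced here for readability):

```python
def compute_neighbours(Z):
    rows, cols = len(Z), len(Z[0])
    # One list per row of Z, each with one entry per column of Z.
    N = [[0] * cols for _ in range(rows)]
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            N[x][y] = (Z[x-1][y-1] + Z[x][y-1] + Z[x+1][y-1]
                       + Z[x-1][y]               + Z[x+1][y]
                       + Z[x-1][y+1] + Z[x][y+1] + Z[x+1][y+1])
    return N

# Non-square field: 3 rows and 5 columns, all cells alive.
Z = [[1] * 5 for _ in range(3)]
N = compute_neighbours(Z)
print(N)
```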

Fix code snippet in Chapter 3.3 (Temporary copy)

There is the following code snippet:

>>> X = np.ones(1000000000, dtype=np.int)
>>> Y = np.ones(1000000000, dtype=np.int)
>>> timeit("X = X + 2.0*Y", globals())
100 loops, best of 3: 3.61 ms per loop
>>> timeit("X = X + 2*Y", globals())
100 loops, best of 3: 3.47 ms per loop
>>> timeit("X += 2*Y", globals())
100 loops, best of 3: 2.79 ms per loop
>>> np.add(X, Y, out=X), np.add(X, Y, out=X),
1000 loops, best of 3: 1.57 ms per loop

I think the last line of Python code should be timeit('np.add(X, Y, out=X); np.add(X, Y, out=X)', globals()). Shouldn't it?
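
The proposed statement can be tried with the standard-library timeit module, as in the sketch below. The array size is reduced and the loop count is arbitrary, so the timing will not match the figures in the snippet.

```python
import timeit
import numpy as np

X = np.ones(1_000_000, dtype=np.int64)  # smaller than in the snippet
Y = np.ones(1_000_000, dtype=np.int64)

# Two in-place additions: the same effect as X += 2*Y,
# without allocating a temporary array for 2*Y.
seconds = timeit.timeit(
    "np.add(X, Y, out=X); np.add(X, Y, out=X)",
    globals={"np": np, "X": X, "Y": Y},
    number=100,
)
print(f"{seconds / 100 * 1e3:.2f} ms per loop")
```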

Missing Highlight CSS ?

Your pre blocks seem to have the correct classes, but they do not seem to be highlighted.

I'm going to guess that this is due to some path issue, as the page tries to include some CSS at weird paths:

<link rel="stylesheet" href="/usr/locaL/lib/python2.7/site-packages/docutils/writers/html4css1/math.css" type="text/css" />
No highlighting on purpose is also a valid answer.

In Introduction 2.1 Simple Example - most benefit is not from vectorization

In chapter "2.1 Simple example", the "Vectorized approach" example leaves the impression that most of the performance benefit comes from itertools.accumulate(). This is not true: the main speed gain comes from using random.choices() instead of random.randint() as in the previous example.

>>> import random
>>> from timeit import timeit

>>> # original example
... def random_walk(n):
...     position = 0
...     walk = [position]
...     for i in range(n):
...         position += 2*random.randint(0, 1)-1
...         walk.append(position)
...     return walk
...
>>>
>>> timeit("random_walk(n=10000)", number=100, globals=globals())
1.7758303454734055
>>> # lets use random.choices() instead
... def random_walk_with_choices(n):
...     position = 0
...     steps = random.choices([1,-1], k=n)
...     walk = [position]
...     for step in steps:
...         position += step
...         walk.append(position)
...     return walk
...
>>> timeit("random_walk_with_choices(n=10000)", number=100, globals=globals())
0.39364298073223836
>>>
>>> # original random_walk_faster with accumulate
... def random_walk_faster(n=1000):
...   from itertools import accumulate
...   steps = random.choices([1,-1], k=n)
...   return list(accumulate(steps))
...
>>> timeit("random_walk_faster(n=10000)", number=100, globals=globals())
0.264255993680635

It is clearly visible that accumulate has only a minor effect on the overall speed (which is mainly driven by itertools() native implementation).

As such, the example is misleading.

Another minor bug with example: random_walk() returns N+1 elements while random_walk_faster() returns N elements.

len(random_walk(1000))
1001
len(random_walk_faster(1000))
1000
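
One way to make the two functions return the same number of elements, sketched here as a suggestion rather than as the book's fix, is to prepend the starting position before accumulating:

```python
import random
from itertools import accumulate, chain

def random_walk_faster(n=1000):
    steps = random.choices([1, -1], k=n)
    # Start from position 0 so that the walk has n + 1 elements,
    # like the loop-based random_walk().
    return list(accumulate(chain([0], steps)))

walk = random_walk_faster(1000)
print(len(walk))
```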

Typo in Section 3.2 Memory Layout - len(shape)

Section 3.2, Third paragraph:

Here, we know that Z itemsize is 2 bytes (int16), the shape is (3,3) and the number of dimensions is 2 (len(shape)).

I believe the command should be len(Z.shape). Of course, to correspond with the code sample which comes immediately after, you could instead have Z.ndim.
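
A quick check of the quantities mentioned in that sentence, on an example 3x3 int16 array built for this sketch:

```python
import numpy as np

Z = np.arange(9).reshape(3, 3).astype(np.int16)

print(Z.itemsize)      # bytes per element
print(Z.shape)
print(len(Z.shape))    # number of dimensions, spelled out
print(Z.ndim)          # same value, as an attribute

# Strides follow from the shape and the item size.
strides = Z.shape[1] * Z.itemsize, Z.itemsize
print(strides == Z.strides)
```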

[4.1 Introduction]: why `add_python` is faster than `add_numpy` for vectorization `add`

I reached the opposite conclusion when running the example code from 4.1 Introduction. Below are my results, tested in IPython 6.4.0 with Python 3.6.5 and NumPy 1.14.3:

In [1]: import numpy as np

In [2]: import random

In [3]: def add_python(Z1,Z2):
   ...:     return [z1+z2 for (z1,z2) in zip(Z1,Z2)]
   ...: 
   ...: def add_numpy(Z1,Z2):
   ...:     return np.add(Z1,Z2)
   ...: 

In [4]: Z1 = random.sample(range(1000), 100)

In [5]: Z2 = random.sample(range(1000), 100)

# For Python lists `Z1`, `Z2`, `add_python` is faster
In [6]: %timeit add_python(Z1, Z2)
8.25 µs ± 205 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [7]: %timeit add_numpy(Z1, Z2)
16.9 µs ± 235 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [8]: a = np.random.randint(0, 1000, size=100)

In [9]: b = np.random.randint(0, 1000, size=100)
# For Numpy array `a`, `b`, `add_numpy` is faster
In [10]: %timeit add_python(a, b)
22.6 µs ± 816 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [11]: %timeit add_numpy(a, b)
851 ns ± 6.37 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
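
A plausible reason for these figures (not confirmed in the thread) is that np.add has to convert Python lists into arrays on every call. The sketch below separates that conversion cost from the addition itself using the standard-library timeit module; the loop count is arbitrary and the timings depend on the machine.

```python
import random
import timeit
import numpy as np

Z1 = random.sample(range(1000), 100)
Z2 = random.sample(range(1000), 100)
A1, A2 = np.array(Z1), np.array(Z2)

scope = {"np": np, "Z1": Z1, "Z2": Z2, "A1": A1, "A2": A2}
loops = 10_000

t_lists = timeit.timeit("np.add(Z1, Z2)", globals=scope, number=loops)
t_arrays = timeit.timeit("np.add(A1, A2)", globals=scope, number=loops)
t_convert = timeit.timeit("np.array(Z1); np.array(Z2)", globals=scope, number=loops)

print(f"np.add on lists:    {t_lists / loops * 1e6:.2f} usec per loop")
print(f"np.add on arrays:   {t_arrays / loops * 1e6:.2f} usec per loop")
print(f"list -> array only: {t_convert / loops * 1e6:.2f} usec per loop")
```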

Typos in 7.3

Excellent work!

I noticed two minor grammatical issues in section 7.3, "Scipy & co" (file 07-beyond-numpy.rst):

  • "there is a trillion" -> "there are a trillion"
  • "it was not the goal" -> "that was not the goal"

find_index

Hello,

When I run the script find_index.py, I get the following error

 $ python find_index.py 
Traceback (most recent call last):
  File "find_index.py", line 68, in <module>
    index = find_index(base,Z)
NameError: name 'find_index' is not defined

If I replace find_index with find_view within the scope of __main__, it seems to work

 $ python find_index.py 
True
True
True
True
True
True
True

3.4 Conclusion - Shouldn't byte_bounds(Z1)[0] point to the beginning of the 0-th element?

Currently byte_bounds(Z1)[0] points to the beginning of element 1:

  byte_bounds(Z1)[0]                  byte_bounds(Z1)[-1]
          ↓                                   ↓ 
   ╌╌╌┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬╌╌
Z1    │ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ 8 │ 9 │
   ╌╌╌┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴╌╌

But the byte_bounds docstring says that "the first integer is the first byte of the array" and:

>>> I = np.eye(2, dtype='f'); I.dtype
dtype('float32')
>>> low, high = np.byte_bounds(I)
>>> high - low == I.size*I.itemsize
True

So it looks like the pointer in the diagram should point before the 0-th element, not after it:

  byte_bounds(Z1)[0]                  byte_bounds(Z1)[-1]
      ↓                                       ↓ 
   ╌╌╌┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬╌╌
Z1    │ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ 8 │ 9 │
   ╌╌╌┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴╌╌
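
The claim can be checked by comparing the lower bound with the address of the first element, as in this sketch. The import fallback is there because, as far as I know, byte_bounds moved to numpy.lib.array_utils in NumPy 2.0; Z1 is an example array built for the check.

```python
import numpy as np

try:
    from numpy.lib.array_utils import byte_bounds  # NumPy 2.0 and later
except ImportError:
    byte_bounds = np.byte_bounds                   # older NumPy versions

Z1 = np.arange(10)
low, high = byte_bounds(Z1)

# Address of element 0, taken from the array interface.
first_element = Z1.__array_interface__["data"][0]

print(low == first_element)
print(high - low == Z1.size * Z1.itemsize)
```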

Grammar corrections in Section 4.1

Section 4.1 Introduction

  1. In the second sentence: The phrase

"Of course it does not mean it is easy nor straightforward, but at least it does not necessitate to totally rethink your problem..."

should be replaced by

"Of course it does not mean it is easy or straightforward, but at least it does not necessitate totally rethinking your problem..."

  2. In the fourth sentence: "most simple" should be "simplest". My personal preference here would be to replace "the simplest example" by "a simple example", but then it is no longer a suggested correction!

Typo in Section 7.1 Back to Python

In the paragraph after the code for solution_1 in Section 7.1, 11641 should be 14641:

This solution is the slowest solution because it requires 4 loops, and more importantly, it tests all the different combinations (11641) -> (14641) of 4 integers between 0 and 10 to retain only combinations whose sum is 10. We can of course get rid of the 4 loops using itertools, but the code remains slow:

Thanks!
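
The corrected count is easy to verify by brute force: each of the 4 integers can take 11 values (0 to 10 inclusive). The sketch below enumerates them with itertools.product.

```python
from itertools import product

# All ordered combinations of 4 integers between 0 and 10 (inclusive).
combos = list(product(range(11), repeat=4))

# Only the combinations whose sum is 10 are retained.
valid = [c for c in combos if sum(c) == 10]

print(len(combos), 11 ** 4)
print(len(valid))
```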

2.1 Introduction - Simple Example, procedural example

The random_walk function is not a direct equivalent to the RandomWalker class method. A strict equivalent would be this:

def random_walk(n):
    position = 0
    for i in range(n):
        yield position
        position += 2*random.randint(0,1)-1

It is still not much faster, but it's a fairer comparison.

Which font/typeface being used for the diagrams?

When trying to compile to PDF from RST using pandoc, I've been encountering several error messages of this type:

[WARNING] Missing character: There is no ΓÇà in font [lmroman10-mono]

Other glyphs that render improperly or not at all include ∑, π, ∇. These seem mostly to be the borders of the tables that are in the various diagrams at the end of the book.

Which font was used when writing the book? I'd like to be able to hunt down which specific unicode is failing to render properly and recompile using a font that isn't missing those glyphs.
