These files may be downloaded from https://python.g-node.org/wiki/numpy

Please do explore beyond the problems given, and feel free to ask questions at any time.

Note

Solutions to some problems are provided, but try to avoid peeking at them until you've made an attempt!

Before we start, load NumPy:

import numpy as np

These exercises explore some of the more advanced features of NumPy. If you prefer to learn the basics from scratch, have a look at this introduction.

Let's explore the basics of a NumPy array. Create a 3x4 array of random values (using

`np.random.random`), and call it`x`. Create another array as follows:`y = x[2]`. What happens when you modify`y`--does`x`also change?In the array above, have a look at its different properties:

`x.shape`,`x.ndim`,`x.dtype`,`x.strides`. What does each property tell you about the array?Compare the

`strides`property of`x.T`to the above. What is`x.T`and can you explain the new strides?Compute the sum of the columns of

`x`. You may find the function to do that by using`np.<TAB>`in IPython,`print dir(np)`in standard Python or`np.lookfor('sum')`. Remember that you can always look at the docstring of any function in Python (use`np.func_name?<ENTER>`in IPython or`help(np.func_name)`in standard Python).Construct the array

`x = np.array([0, 1, 2, 3], dtype=np.uint8)`(here,`uint8`represents a single byte in memory, an unsigned integer between 0 and 255). Can you explain the results obtained by`x + 1`and`x / 2`? Also try`x.astype(float) + 1`and`x.astype(float) / 2`.What is the maximum number of dimensions a NumPy array can have? Use one of the array constructors (

`np.zeros`,`np.empty`,`np.random.random`, etc.) to find out.

Design a data-type for storing the following record:

- Timestamp in nanoseconds (a 64-bit unsigned integer)
- Position (x- and y-coordinates, stored as floating point numbers)

Use it to represent the following data:

dt = np.dtype(<your code here>) x = np.array([(100, (0, 0.5)), (200, (0, 10.3)), (300, (5.5, 15.1))], dtype=dt)

Have a look at the

`np.dtype`docstring if you need help. After constructing`x`, try to print all the x values for which time is greater than 100 (hint: something of the form`y[y > 100]`).Place the following data in a text file:

% rank lemma (10 letters max) frequency dispersion 21 they 1865844 0.96 42 her 969591 0.91 49 as 829018 0.95 7 to 6332195 0.98 63 take 670745 0.97 14 you 3085642 0.92 35 go 1151045 0.93 56 think 772787 0.91 28 not 1638883 0.98

Now, design a suitable structured data type, then load the data from the text file. Examine the data you got, for example: extract words only, extract the 3rd row, print all words with rank

< 30.Sort the data according to frequency (see

np.sort). Save the result to a Numpy data filesorted.npzwithnp.savezand load back withnp.load. Do you get back what you put in?

Take a look at the following. Many of you may know it as a meshgrid:

x, y = np.mgrid[:10, :5] z = x + y

Now, reproduce

`z`from the above snippet, but this time use NumPy's`ogrid`together with broadcasting.In our solution, broadcasting is used "behind the scenes". To see what happens more explicitly, apply

`np.broadcast_arrays`on the`x`and`y`given by ogrid`ogrid`. This should correspond to the`x`and`y`produced by`mgrid`.Benchmark the two approaches (one using

`mgrid`, the other`ogrid`), using IPython's`timeit`function, on a larger grid. Can you explain the difference in execution time?Given a list of XYZ-coordinates,

`p`,[[1.0, 2.0, 10], [3.0, 4.0, 20], [5.0, 6.0, 30], [7.0, 8.0, 40]]

Normalise each coordinate by dividing with its Z (3rd) element. For example, the first row becomes:

[1/10, 2/10, 10/10]

Hint: extract the last column as

`y`, and then change its dimensions so that`p / y`works.

Create a 3x3 ndarray. Use fancy indexing to slice out the diagonal elements.

Generate a 10 x 3 array of random numbers (in range [0,1]). From each row, pick the number closest to 0.75. Make use of

`np.abs`and`np.argmax`to find the column`j`which contains the closest element in each row.Predict and verify the shape of the following slicing operation.

x = np.empty((10, 8, 6)) idx0 = np.zeros((3, 8)).astype(int) idx1 = np.zeros((3, 1)).astype(int) idx3 = np.zeros((1, 1)).astype(int) x[idx0, idx1, idx2]

**Advanced**: This is not strictly speaking a question on indexing, but it's a fun exercise either way.Construct an array

x = np.arange(12, dtype=np.int32).reshape((3, 4))

so that

`x`isarray([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]])

Now, provide to

`np.lib.stride_tricks.as_strided`the strides necessary to view a sliding 2x2 window over this array. The output should bearray([[[[ 0, 1], [ 4, 5]], [[ 1, 2], [ 5, 6]], [[ 2, 3], [ 6, 7]]], [[[ 4, 5], [ 8, 9]], [[ 5, 6], [ 9, 10]], [[ 6, 7], [10, 11]]]], dtype=int32)

The code is of the form

z = as_strided(x, shape=(2, 3, 2, 2), strides=(..., ..., ..., ...))

This sort of stride manipulation is very handy to implement techniques such as region based statistics, convolutions, etc.

Documentation for NumPy's `__array_interface__` may be found in the
online docs.

An author of a foreign package (

`array_interface/mutable_str.py`) provides a string class that allocates its own memory:In [1]: from mutable_str import MutableString In [2]: s = MutableString('abcde') In [3]: print s abcde

You'd like to view these mutable (mutable means the ability to modify in place) strings as ndarrays, in order to manipulate the underlying memory.

- Add an
`__array_interface__`dictionary attribute to`s`, then convert`s`to an ndarray. Use the given`array_interface/template.py`as a guide. - Numerically add "1" to the array. Now print the original string to ensure that its value was modified.

- Add an

Construct the following two arrays:

x = np.array([[1, 2], [3, 4]], order='C', dtype=np.uint8) y = np.array([[1, 2], [3, 4]], order='F', dtype=np.uint8)

Compare the bytes they store in memory by using

[ord(c) for c in x.data]

Note that, even though these arrays store data in different memory order, they are identical from the user's perspective.

print x print y

Examine the bytes stored by the following array (using the "ord" trick shown above).

x = np.array([[1, 2], [3, 4]], dtype=np.uint32)

Note that, on most laptops, the byte order will be little Endian, i.e. least significant byte first.

Attempt the Fortran-ordering quiz.