Last week we were working with containers:
This week we will be working with NumPy.
NumPy is a library for the Python, adding:
Using NumPy in Python gives functionality comparable to MATLAB. They both allow the user to write fast programs as long as most operations work on arrays or matrices instead of scalars.
Complementary Python packages are available:
Internally, both MATLAB and NumPy rely on BLAS and LAPACK for efficient linear algebra computations.
In this notebook we:
np.dot(,)
Later I'll show you how to use
these are really advanced topics.
There is maybe one place where I use a for
loop -
usually when we use numpy
we try and avoid loops
because they are slow.
Later we will see how to avoid using conditionals like if statements.
import numpy as np
X = np.arange(10)
type(X)
numpy.ndarray
reshaping an array is an important operation that we'll use a lot
X.reshape(2,5)
array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])
X.reshape(5,2)
array([[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]])
Y = X.reshape((5,2))
nearly back to the original array
Y.reshape(1,-1)
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
An array object represents a multidimensional, homogeneous array of fixed-size items. An associated data-type object describes the format of each element in the array (its byte-order, how many bytes it occupies in memory, whether it is
Arrays can be constructed using
Y = np.array([[1,2],[3,4]])
Y[:,0], Y[0,:].reshape(-1,1)
(array([1, 3]), array([[1], [2]]))
Y, Y.dot(_[1])
(array([[1, 2], [3, 4]]), array([[ 5], [11]]))
np.identity(4)
array([[1., 0., 0., 0.], [0., 1., 0., 0.], [0., 0., 1., 0.], [0., 0., 0., 1.]])
np.arange(10).shape
(10,)
np.ones(6,dtype=np.int32)
array([1, 1, 1, 1, 1, 1], dtype=int32)
np.ones((2,3), dtype=np.int32)
array([[1, 1, 1], [1, 1, 1]], dtype=int32)
?np.ones
Signature: np.ones(shape, dtype=None, order='C', *, like=None) Docstring: Return a new array of given shape and type, filled with ones. Parameters ---------- shape : int or sequence of ints Shape of the new array, e.g., ``(2, 3)`` or ``2``. dtype : data-type, optional The desired data-type for the array, e.g., `numpy.int8`. Default is `numpy.float64`. order : {'C', 'F'}, optional, default: C Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory. like : array_like Reference object to allow the creation of arrays which are not NumPy arrays. If an array-like passed in as ``like`` supports the ``__array_function__`` protocol, the result will be defined by it. In this case, it ensures the creation of an array object compatible with that passed in via this argument. .. versionadded:: 1.20.0 Returns ------- out : ndarray Array of ones with the given shape, dtype, and order. See Also -------- ones_like : Return an array of ones with shape and type of input. empty : Return a new uninitialized array. zeros : Return a new array setting values to zero. full : Return a new array of given shape filled with value. Examples -------- >>> np.ones(5) array([1., 1., 1., 1., 1.]) >>> np.ones((5,), dtype=int) array([1, 1, 1, 1, 1]) >>> np.ones((2, 1)) array([[1.], [1.]]) >>> s = (2,2) >>> np.ones(s) array([[1., 1.], [1., 1.]]) File: ~/anaconda3/lib/python3.9/site-packages/numpy/core/numeric.py Type: function
M = np.arange(1,13); M
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
M.reshape((3,4)), M # make a 2d array
(array([[ 1, 2, 3, 4], [ 5, 6, 7, 8], [ 9, 10, 11, 12]]), array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]))
N = M.reshape((3,4)); N # a 2d array
array([[ 1, 2, 3, 4], [ 5, 6, 7, 8], [ 9, 10, 11, 12]])
N.reshape(12) #back to M
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
N.ravel() #works more generally - watch the videos
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
N.ravel(order='F') #fortran order - column by column
array([ 1, 5, 9, 2, 6, 10, 3, 7, 11, 4, 8, 12])
N.transpose()
array([[ 1, 5, 9], [ 2, 6, 10], [ 3, 7, 11], [ 4, 8, 12]])
N.T
array([[ 1, 5, 9], [ 2, 6, 10], [ 3, 7, 11], [ 4, 8, 12]])
np.arange(5).transpose() # doesn't work
array([0, 1, 2, 3, 4])
np.arange(5).reshape((5,1)) #col vector
array([[0], [1], [2], [3], [4]])
this can be complicated
np.arange(20)[3:-3] # one dimensional so works as with a list
array([ 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16])
N # but this is a 2d array
array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]])
N[:2] #first 2 rows
array([[0, 1, 2, 3], [4, 5, 6, 7]])
N[0,0], N[0,1], N[1,0], N[-1,-1] # some elements
(0, 1, 4, 11)
N[:, 1:3] #2nd and 3rd columns
array([[ 1, 2], [ 5, 6], [ 9, 10]])
N[1,:] #2nd row
array([4, 5, 6, 7])
N[:,1] #2nd col
array([1, 5, 9])
N[:2,:2] #2x2 block
array([[0, 1], [4, 5]])
N[1:,2:] #2x2 block
array([[ 6, 7], [10, 11]])
A = np.zeros((4,4))
A
array([[0., 0., 0., 0.], [0., 0., 0., 0.], [0., 0., 0., 0.], [0., 0., 0., 0.]])
A[:2, :2] = 1
A[2:, 2:] = 3
A[:2, 2:] = 2
A
array([[1., 1., 2., 2.], [1., 1., 2., 2.], [0., 0., 3., 3.], [0., 0., 3., 3.]])
B = np.ones_like(A)
B
array([[1., 1., 1., 1.], [1., 1., 1., 1.], [1., 1., 1., 1.], [1., 1., 1., 1.]])
B[1:,1:] = A[1:,1:]
B
array([[1., 1., 1., 1.], [1., 1., 2., 2.], [1., 0., 3., 3.], [1., 0., 3., 3.]])
A[1] = A[1] + A[2] #add rows
A
array([[ 1., 1., 2., 2.], [ 1., 1., 11., 11.], [ 0., 0., 3., 3.], [ 0., 0., 3., 3.]])
A[:,0] *= 2 #multiply column by scalar
A
array([[64., 1., 2., 2.], [64., 1., 11., 11.], [ 0., 0., 3., 3.], [ 0., 0., 3., 3.]])
N
array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]])
2*N # scalar multiplication
array([[ 0, 2, 4, 6], [ 8, 10, 12, 14], [16, 18, 20, 22]])
np.ones_like(N)
array([[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]])
N + np.ones_like(N) #addition
array([[ 1, 2, 3, 4], [ 5, 6, 7, 8], [ 9, 10, 11, 12]])
This uses an advanced trick called broadcasting to do the same thing.
Basically the 1 is automatically converted to np.ones_like(N)
N + 1
array([[ 1, 2, 3, 4], [ 5, 6, 7, 8], [ 9, 10, 11, 12]])
N[:,:-1] + np.identity(3)
array([[ 1., 1., 2.], [ 4., 6., 6.], [ 8., 9., 11.]])
N
array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]])
N*N
array([[ 0, 1, 4, 9], [ 16, 25, 36, 49], [ 64, 81, 100, 121]])
np.sqrt(N*N)
array([[ 0., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 11.]])
np.sqrt(np.arange(10))
array([0. , 1. , 1.41421356, 1.73205081, 2. , 2.23606798, 2.44948974, 2.64575131, 2.82842712, 3. ])
def f(x):
return np.log(2*x + 1)
f(np.arange(10))
array([0. , 1.09861229, 1.60943791, 1.94591015, 2.19722458, 2.39789527, 2.56494936, 2.7080502 , 2.83321334, 2.94443898])
%%timeit
f(np.arange(1000))
26.4 µs ± 2.7 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%%timeit
[f(x) for x in np.arange(1000)]
2.23 ms ± 76.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
g = np.vectorize(f)
%%timeit
g(np.arange(1000))
1.18 ms ± 22.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Remember: $ v.v = \| v\|^2$
v = np.arange(5)
v
array([0, 1, 2, 3, 4])
np.dot(v,v)
30
this notation is probably better because it avoids parenthisis
v @ v
30
np.linalg.norm(v) ** 2
30.0
v @ np.ones_like(v)
10
np.sum(v)
10
R = np.ones((2,2))
R[0,1] *= -1
R
array([[ 1., -1.], [ 1., 1.]])
R @ np.array([1,0]), R @ np.array([0,1])
(array([1., 1.]), array([-1., 1.]))
R @ R
array([[ 0., -2.], [ 2., 0.]])
we'll use matplotlib to draw some graphs.
Matplotlib can be very complicated so we'll study it properly next week.
import matplotlib.pyplot as plt
import numpy as np
def f(x):
return np.log(2*x + 1)
plt.plot(f(np.arange(10)) );
T = np.linspace(0, 2*np.pi, 20)
T
array([0. , 0.33069396, 0.66138793, 0.99208189, 1.32277585, 1.65346982, 1.98416378, 2.31485774, 2.64555171, 2.97624567, 3.30693964, 3.6376336 , 3.96832756, 4.29902153, 4.62971549, 4.96040945, 5.29110342, 5.62179738, 5.95249134, 6.28318531])
? np.linspace
Signature: np.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0) Docstring: Return evenly spaced numbers over a specified interval. Returns `num` evenly spaced samples, calculated over the interval [`start`, `stop`]. The endpoint of the interval can optionally be excluded. .. versionchanged:: 1.16.0 Non-scalar `start` and `stop` are now supported. Parameters ---------- start : array_like The starting value of the sequence. stop : array_like The end value of the sequence, unless `endpoint` is set to False. In that case, the sequence consists of all but the last of ``num + 1`` evenly spaced samples, so that `stop` is excluded. Note that the step size changes when `endpoint` is False. num : int, optional Number of samples to generate. Default is 50. Must be non-negative. endpoint : bool, optional If True, `stop` is the last sample. Otherwise, it is not included. Default is True. retstep : bool, optional If True, return (`samples`, `step`), where `step` is the spacing between samples. dtype : dtype, optional The type of the output array. If `dtype` is not given, infer the data type from the other input arguments. .. versionadded:: 1.9.0 axis : int, optional The axis in the result to store the samples. Relevant only if start or stop are array-like. By default (0), the samples will be along a new axis inserted at the beginning. Use -1 to get an axis at the end. .. versionadded:: 1.16.0 Returns ------- samples : ndarray There are `num` equally spaced samples in the closed interval ``[start, stop]`` or the half-open interval ``[start, stop)`` (depending on whether `endpoint` is True or False). step : float, optional Only returned if `retstep` is True Size of spacing between samples. See Also -------- arange : Similar to `linspace`, but uses a step size (instead of the number of samples). geomspace : Similar to `linspace`, but with numbers spaced evenly on a log scale (a geometric progression). logspace : Similar to `geomspace`, but with the end points specified as logarithms. Examples -------- >>> np.linspace(2.0, 3.0, num=5) array([2. , 2.25, 2.5 , 2.75, 3. ]) >>> np.linspace(2.0, 3.0, num=5, endpoint=False) array([2. , 2.2, 2.4, 2.6, 2.8]) >>> np.linspace(2.0, 3.0, num=5, retstep=True) (array([2. , 2.25, 2.5 , 2.75, 3. ]), 0.25) Graphical illustration: >>> import matplotlib.pyplot as plt >>> N = 8 >>> y = np.zeros(N) >>> x1 = np.linspace(0, 10, N, endpoint=True) >>> x2 = np.linspace(0, 10, N, endpoint=False) >>> plt.plot(x1, y, 'o') [<matplotlib.lines.Line2D object at 0x...>] >>> plt.plot(x2, y + 0.5, 'o') [<matplotlib.lines.Line2D object at 0x...>] >>> plt.ylim([-0.5, 1]) (-0.5, 1) >>> plt.show() File: ~/anaconda3/lib/python3.6/site-packages/numpy/core/function_base.py Type: function
Y = np.sin(T)
X = np.cos(T)
plt.plot(T,X*X + 3*Y)
plt.plot(T,Y);
plt.plot(X,Y);
S = np.linspace(0, 2*np.pi, 400)
plt.plot(np.cos(5*S), np.cos(13*S));
M = np.identity(3)
M[0] = np.arange(1,4)
M[1,2] = 4
M
array([[1., 2., 3.], [0., 1., 4.], [0., 0., 1.]])
np.linalg.det(M)
1.0
np.linalg.inv(M)
array([[ 1., -2., 5.], [ 0., 1., -4.], [ 0., 0., 1.]])
np.dot(M,_)
array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
np.linalg.solve(M, np.ones(3))
array([ 4., -3., 1.])
np.dot(M, _)
array([1., 1., 1.])
np.dot(np.linalg.inv(M), np.ones(3))
array([ 4., -3., 1.])
A = np.arange(9).reshape(3,3)
A
array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
np.linalg.det(A)
0.0
A[0] - 2* A[1] + A[2]
array([0, 0, 0])
vps, basis = np.linalg.eig(A)
vps #eigenvalues
array([ 1.33484692e+01, -1.34846923e+00, -2.48477279e-16])
vps[2]
-2.484772788451793e-16
basis
array([[ 0.16476382, 0.79969966, 0.40824829], [ 0.50577448, 0.10420579, -0.81649658], [ 0.84678513, -0.59128809, 0.40824829]])
np.dot(A, basis)
array([[ 2.19934474e+00, -1.07837039e+00, -3.33066907e-16], [ 6.75131503e+00, -1.40518298e-01, -8.88178420e-16], [ 1.13032853e+01, 7.97333791e-01, -1.33226763e-15]])
the third column of the basis is in the kernel
np.dot(A, basis[:,2])
array([-3.33066907e-16, -4.44089210e-16, -8.88178420e-16])
basis[:,2]/basis[0,2]
array([ 1., -2., 1.])
A[0] - 2* A[1] + A[2]
array([0, 0, 0])
from the characteristic polynomial
$$t \times(t^2 -12t -18)$$np.poly(A).astype(np.int32) # coeficients of characteristic polynomial
array([ 1, -12, -18, 0], dtype=int32)
disc = np.sqrt(12**2 + 4*18)
disc, 6*np.sqrt(6)
(14.696938456699069, 14.696938456699067)
(12 + disc )/2, (12 - disc )/2
(13.348469228349535, -1.3484692283495345)
np.linalg.eigvals
to find the biggest eigenvalue for the matricesimport random
from numpy.linalg import det
from numpy.random import randint
from numpy.linalg import eigvals
#? np.random.randint
X = [ 1 for k in range(100000) if abs(det( randint(0,2,9).reshape(3,3))) < .5 ]
len(X)
66019
eigvals( 2*np.random.rand(2,2) -1)
array([0.22865269+0.20750086j, 0.22865269-0.20750086j])
? np.linalg.eigvals
Signature: np.linalg.eigvals(a) Docstring: Compute the eigenvalues of a general matrix. Main difference between `eigvals` and `eig`: the eigenvectors aren't returned. Parameters ---------- a : (..., M, M) array_like A complex- or real-valued matrix whose eigenvalues will be computed. Returns ------- w : (..., M,) ndarray The eigenvalues, each repeated according to its multiplicity. They are not necessarily ordered, nor are they necessarily real for real matrices. Raises ------ LinAlgError If the eigenvalue computation does not converge. See Also -------- eig : eigenvalues and right eigenvectors of general arrays eigvalsh : eigenvalues of real symmetric or complex Hermitian (conjugate symmetric) arrays. eigh : eigenvalues and eigenvectors of real symmetric or complex Hermitian (conjugate symmetric) arrays. Notes ----- .. versionadded:: 1.8.0 Broadcasting rules apply, see the `numpy.linalg` documentation for details. This is implemented using the ``_geev`` LAPACK routines which compute the eigenvalues and eigenvectors of general square arrays. Examples -------- Illustration, using the fact that the eigenvalues of a diagonal matrix are its diagonal elements, that multiplying a matrix on the left by an orthogonal matrix, `Q`, and on the right by `Q.T` (the transpose of `Q`), preserves the eigenvalues of the "middle" matrix. In other words, if `Q` is orthogonal, then ``Q * A * Q.T`` has the same eigenvalues as ``A``: >>> from numpy import linalg as LA >>> x = np.random.random() >>> Q = np.array([[np.cos(x), -np.sin(x)], [np.sin(x), np.cos(x)]]) >>> LA.norm(Q[0, :]), LA.norm(Q[1, :]), np.dot(Q[0, :],Q[1, :]) (1.0, 1.0, 0.0) Now multiply a diagonal matrix by ``Q`` on one side and by ``Q.T`` on the other: >>> D = np.diag((-1,1)) >>> LA.eigvals(D) array([-1., 1.]) >>> A = np.dot(Q, D) >>> A = np.dot(A, Q.T) >>> LA.eigvals(A) array([ 1., -1.]) # random File: ~/anaconda3/lib/python3.8/site-packages/numpy/linalg/linalg.py Type: function
Images can be manipulated like matrix. I learned how to here.
You can use any image you like but please start by dowloading this one
import imageio
import matplotlib.pyplot as plt
im = imageio.imread('https://macbuse.github.io/PROG/anonymous.png')
im.max()
255
im.shape
(769, 556)
plt.imshow(im);
im[100:150,:] = 255
im[600:650,:] = 178
plt.imshow(im);
im[100:150,200:400] = 0
plt.imshow(im);
color images are 3 layers
each layer is a 2x2 matrix of floats between 0 and 1
col_im = np.array((im.T/255.,)*3).T #duplicate the layer and convert to floats
plt.imshow(col_im);
R,G,B = col_im.T
G *= 0
B *= 0
plt.imshow(col_im);
col_im = np.stack((im.T/255.,)*3).T #duplicate the layer and convert to floats
R,G,B = np.copy(col_im).T
G *= .3
plt.imshow(np.stack((R,G,B)).T);
you can resize (resample) an image but you should really use scipy.transform.resize
plt.imshow(im[::2,:]);
plt.imshow(im[:,::2]);
plt.imshow(im[::2,::2]);
there are a lot of other things you can do
xx = np.copy(im)
xx[::5,:] = 0
plt.imshow(xx);
yy = im[::2,::2]
dy, dx = yy.shape
pp = np.zeros((2*dy,2*dx), dtype=np.uint8)
pp[:dy,:dx] = yy
pp[:dy,dx:] = yy//4
pp[dy:,:dx] = yy//2
plt.imshow(pp);
rather than writing blocks
plt.imshow(np.concatenate((yy,yy), axis=1 ));
plt.imshow(np.concatenate((yy, .3*yy), axis=0 ));
tmp = np.concatenate((yy,.5*yy,.25*yy), axis = 1)
plt.imshow( np.concatenate((tmp,.5*tmp,.25*tmp)));
mary = imageio.imread('./marilyn.jpg')
plt.imshow(mary);
mary.shape
(1024, 1024, 3)
sx = 1024//3
mm = mary[sx:2*sx,sx:2*sx]
plt.imshow(mm);