# Tutor Class 25/11/2024

#### Eleonora Cicciarella (eleonora.cicciarella@phd.unipd.it)

## Numpy 

Link to the documentation: https://numpy.org/

In [160]:
import numpy as np

### What is Numpy?

NumPy (*Numerical Python*) is the fundamental package for *scientific computing* in Python. It is a Python library that provides a *multidimensional array object*, various derived objects (such as masked arrays and matrices), and an assortment of routines for **fast operations** on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more. 

### Why Numpy?

Two of NumPy’s features are the basis of much of its power: *vectorization* and *broadcasting*.

1. Vectorization refers to **applying operations to whole arrays** instead of to individual elements (i.e., the **absence of any explicit looping** in the Python code).

Why vectorize?
- Much faster
- Easier to read and fewer lines of code
- More closely resembles mathematical notation

2. Broadcasting is the term used to describe the implicit **element-by-element behavior of operations**; generally speaking, in NumPy all operations, not just arithmetic operations, but logical, bit-wise, functional, etc., behave in this implicit element-by-element fashion, i.e., they broadcast (see https://numpy.org/doc/stable/user/basics.broadcasting.html#basics-broadcasting for further clarifications).

In [161]:
%%timeit
a = list(range(100000))
b = list(range(100000))

for _ in range(10):
    c = []
    for i in range(len(a)):
        c.append(a[i] + 2 * b[i])

62.1 ms ± 2.78 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [162]:
%%timeit
a = np.arange(100000)
b = np.arange(100000)

for _ in range(10):
    c = a + 2 * b

596 µs ± 23.9 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


### The N-dimensional array (*ndarray*)

An ndarray is a *multidimensional* container of items of the *same type and size*. The number of dimensions and items in an array is defined by its *shape*, which is a tuple of N non-negative integers that specify the sizes of each dimension. The type of items in the array is specified by a separate data-type object (dtype), one of which is associated with each ndarray.

As with other container objects in Python, the contents of an ndarray can be accessed and modified by indexing or slicing the array (using, for example, N integers), and via the methods and attributes of the ndarray.
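
As a minimal sketch of the indexing and slicing described above (the array `m` and the variable names are illustrative and not part of the original notebook):

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m[0])                  # integer index on the first axis: [1 2 3]

element = m[1, 2]            # one integer per axis selects a single item
print(element)               # 6

last_two_cols = m[:, 1:]     # slicing keeps both axes
print(last_two_cols.shape)   # (2, 2)

m[0, 0] = 10                 # indexing can also be used to modify the array
print(m[0])                  # [10  2  3]
```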

In [164]:
# Can initialize ndarrays with Python lists, for example:
a = np.array([1, 2, 3])   # Create a rank 1 array
print('dtype:', a.dtype)  
print('shape:', a.shape)  
        

b = np.array([[1, 2, 3],
              [4, 5, 6]], dtype='float64')  # Create a rank 2 array
print('dtype:', b.dtype) 
print('shape:', b.shape)          
print(b)  

dtype: int32
shape: (3,)
dtype: float64
shape: (2, 3)
[[1. 2. 3.]
 [4. 5. 6.]]


See https://numpy.org/doc/stable/reference/arrays.ndarray.html#array-methods for all the ndarray methods.

There are many other array initializations that NumPy provides:

In [165]:
a = np.zeros((2, 2))   # Create an array of all zeros
print(a)


b = np.full((2, 2), 7)  # Create a constant array
print(b)               
                        

c = np.eye(2)         # Create a 2 x 2 identity matrix
print(c)              
                     

d = np.random.random((2, 2))  # Create an array filled with random values
print(d)                      
                              

[[0. 0.]
 [0. 0.]]
[[7 7]
 [7 7]]
[[1. 0.]
 [0. 1.]]
[[0.17092451 0.42110865]
 [0.12745882 0.39882955]]


#### Shapes

In [167]:
nums = np.arange(8)
print(f"Original array:\n{nums}")
print(f"Original shape: {nums.shape}\n")

nums = nums.reshape((2, 4))
print(f'Reshaped array:\n{nums}')
print(f"New shape: {nums.shape}\n")

# The -1 in reshape corresponds to an unknown dimension that numpy will figure out,
# based on all other dimensions and the array size.
# Can only specify one unknown dimension.
# For example, sometimes we might have an unknown number of data points, and
# so we can use -1 instead, without worrying about the true number.
nums = nums.reshape((-1, 4))
print(f'Reshaped with -1:\n{nums}')
print(f"New shape: {nums.shape}\n")

# You can also flatten the array by using -1 reshape
flatten_nums = nums.reshape(-1)
print(f'Flatten array:\n{flatten_nums}') 
print(f'Flatten shape: {flatten_nums.shape}')

Original array:
[0 1 2 3 4 5 6 7]
Original shape: (8,)

Reshaped array:
[[0 1 2 3]
 [4 5 6 7]]
New shape: (2, 4)

Reshaped with -1:
[[0 1 2 3]
 [4 5 6 7]]
New shape: (2, 4)

Flatten array:
[0 1 2 3 4 5 6 7]
Flatten shape: (8,)


What happens if we try to reshape the array with an invalid shape?

In [168]:
nums = nums.reshape((5, 2))

ValueError: cannot reshape array of size 8 into shape (5,2)

### Array Operations/Math

In [170]:
x = np.array([[1, 2],
              [3, 4]], dtype=np.float32)
y = np.array([[5, 6],
              [7, 8]], dtype=np.float64)

# Elementwise sum; both produce the array
# [[ 6.0  8.0]
#  [10.0 12.0]]
print(np.array_equal(x + y, np.add(x, y)))

# Elementwise difference; both produce the array
# [[-4.0 -4.0]
#  [-4.0 -4.0]]
print(np.array_equal(x - y, np.subtract(x, y)))

# Elementwise product; both produce the array
# [[ 5.0 12.0]
#  [21.0 32.0]]
print(np.array_equal(x * y, np.multiply(x, y)))

# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(np.array_equal(x / y, np.divide(x, y)))

# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))

True
True
True
True
[[1.        1.4142135]
 [1.7320508 2.       ]]


**Note**: `*` is *elementwise multiplication*, not matrix multiplication. We instead use the `dot` function to compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices. `dot` is available both as a function in the numpy module and as an instance method of array objects:

In [171]:
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

v = np.array([9, 10])
w = np.array([11, 12])

print("Inner product of vectors")
print(v.dot(w))
print(np.dot(v, w))

print("\nMatrix - vector product")
print(x.dot(v))
print(np.dot(x, v))

print("\nMatrix - matrix product")
print(x.dot(y))
print(np.dot(x, y))

Inner product of vectors
219
219

Matrix - vector product
[29 67]
[29 67]

Matrix - matrix product
[[19 22]
 [43 50]]
[[19 22]
 [43 50]]
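
As a side note, Python's `@` operator gives the same results as `dot` for the 1-D and 2-D arrays used here. A short sketch, reusing the arrays from the cell above (the names `mat_vec` and `mat_mat` are illustrative):

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
v = np.array([9, 10])

mat_vec = x @ v   # matrix - vector product
mat_mat = x @ y   # matrix - matrix product

print(mat_vec)                                # [29 67]
print(np.array_equal(mat_mat, np.dot(x, y)))  # True
```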


There are many useful functions built into NumPy, and we can often apply them along a specific axis of the ndarray:

In [172]:
x = np.array([[1, 2, 3], 
              [4, 5, 6]])

print(f"Array:\n{x}")

print(f"Sum of all elements: {np.sum(x)}")          # Compute sum of all elements
print(f"Sum among columns: {np.sum(x, axis=0)}")    # Compute sum of each column
print(f"Sum among rows: {np.sum(x, axis=1)}")       # Compute sum of each row

print(f"Max among rows: {np.max(x, axis=1)}")       # Compute max of each row

Array:
[[1 2 3]
 [4 5 6]]
Sum of all elements: 21
Sum among columns: [5 7 9]
Sum among rows: [ 6 15]
Max among rows: [3 6]


We can find the indices of the elements that satisfy a condition by using `np.where`:

In [175]:
x

array([[1, 2, 3],
       [4, 5, 6]])

In [176]:
#print(np.where(x >= 4))
print(x[np.where(x >= 4)])

[4 5 6]
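
To make the role of the indices explicit, here is a small sketch showing what `np.where` returns for the same array, together with the equivalent boolean-mask indexing (the names `rows`, `cols` and `mask` are illustrative):

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])

# With a single argument, np.where returns one index array per axis
rows, cols = np.where(x >= 4)
print(rows, cols)   # [1 1 1] [0 1 2]

# A boolean mask with the same shape as x selects the same elements
mask = x >= 4
print(x[mask])      # [4 5 6]
```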


#### Broadcasting

In [177]:
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
print(f"Array x:\n{x}")
print(f"Shape of x: {x.shape}")

v = np.array([1, 0, 1])
print(f"\nArray v:\n{v}")
print(f"Shape of v: {v.shape}")

Array x:
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
Shape of x: (4, 3)

Array v:
[1 0 1]
Shape of v: (3,)


In [178]:
y = x + v  # Add v to each row of x using broadcasting
print(f"Array y:\n{y}")
print(f"Shape of y: {y.shape}")

Array y:
[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]
Shape of y: (4, 3)


The line `y = x + v` works even though $x$ has shape `(4, 3)` and $v$ has shape `(3,)` due to broadcasting; this line works as if $v$ actually had shape `(4, 3)`, where each row was a copy of $v$, and the sum was performed **elementwise**.
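
The "as if" in the paragraph above can be checked directly. The sketch below uses `np.broadcast_to` to build the expanded version of `v` explicitly, and also shows a column vector of shape `(4, 1)` being broadcast across the columns (the array `w` is an illustrative addition):

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
v = np.array([1, 0, 1])

# Explicit (4, 3) version of v: every row is a (read-only) view of v
v_expanded = np.broadcast_to(v, x.shape)
print(v_expanded.shape)                       # (4, 3)
print(np.array_equal(x + v, x + v_expanded))  # True

# A (4, 1) column vector is added to each column of x instead
w = np.array([[1], [2], [3], [4]])
col_sum = x + w
print(col_sum.shape)                          # (4, 3)
```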

#### 1.  Create a 3x3 matrix with values ranging from 1 to 9

In [198]:
# 1 line of code
# hint: use arange and then reshape
x = np.arange(1, 10)
#np.arange(9)+1
x.reshape((3,3))

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

Expected output 

```
[[1 2 3]
 [4 5 6]
 [7 8 9]]
```

#### 2. Replace the maximum value(s) of matrix Z by 0

In [202]:
Z = np.array([[1, 5, 7], [-1, 7, 3]])
print(Z)

[[ 1  5  7]
 [-1  7  3]]


In [204]:
# 1 line of code
# hint: use where and max
np.where(Z!=np.max(Z), Z, 0)

array([[ 1,  5,  0],
       [-1,  0,  3]])

Expected output 

```
[[ 1  5  0]
 [-1  0  3]]
```

#### 3. Subtract the mean of each row of matrix X

In [206]:
X = np.array([[1,2,3], [4,6,8], [5,10,15]])
print(X)

[[ 1  2  3]
 [ 4  6  8]
 [ 5 10 15]]


In [207]:
# 1 line of code
# use np.mean() - pay attention to axis=.. and keepdims=...
X-X.mean(axis=1, keepdims=True)

array([[-1.,  0.,  1.],
       [-2.,  0.,  2.],
       [-5.,  0.,  5.]])

Expected output 

```
[[-1.  0.  1.]
 [-2.  0.  2.]
 [-5.  0.  5.]]
```
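
The hint about `keepdims` matters here: because `X` is square, omitting it does not raise an error but silently subtracts the means along the wrong axis. A small sketch of the difference (the names `wrong` and `right` are illustrative):

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 6, 8], [5, 10, 15]])

print(X.mean(axis=1).shape)                 # (3,)
print(X.mean(axis=1, keepdims=True).shape)  # (3, 1)

# Shape (3,) is broadcast along the rows: column j gets the mean of row j
wrong = X - X.mean(axis=1)
# Shape (3, 1) is broadcast along the columns: row i gets its own mean
right = X - X.mean(axis=1, keepdims=True)

print(wrong[0])   # [-1. -4. -7.]
print(right[0])   # [-1.  0.  1.]
```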

#### 4. Sort the array X by the first column

In [208]:
X = np.array([[1,-2,7], [-4,0,10], [10,8,5]])
print(X)

[[ 1 -2  7]
 [-4  0 10]
 [10  8  5]]


In [None]:
# 1 line of code
# hint: use argsort - not on the entire array
# but remember that argsort returns the indices that sort the array

X[np.argsort(X[:,0]), :] # equivalently: X[np.argsort(X[:,0])]

array([[-4,  0, 10],
       [ 1, -2,  7],
       [10,  8,  5]])

Expected output 

```
[[-4  0 10]
 [ 1 -2  7]
 [10  8  5]]
```
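
To see what `argsort` contributes to the solution, the sketch below prints the index array on its own and then reuses it, reversed, for a descending sort (the variable names are illustrative):

```python
import numpy as np

X = np.array([[1, -2, 7], [-4, 0, 10], [10, 8, 5]])

# Indices that would sort the first column [1, -4, 10] in ascending order
order = np.argsort(X[:, 0])
print(order)              # [1 0 2]

# Reversing the index array sorts the rows in descending order instead
descending = X[order[::-1]]
print(descending[0])      # [10  8  5]
```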

#### 5. Find the value(s) in array X nearest to z = 3.5

In [211]:
X = np.array([[1,-2,7], [-4,0,10], [10,8,5]])
print(X)

[[ 1 -2  7]
 [-4  0 10]
 [10  8  5]]


In [213]:
# hint: use np.abs(X - z) and then argmin
# remember to flatten or reshape X to obtain the nearest value
z= 3.5
X.reshape(-1)[np.argmin(np.abs(X-z))] #equivalently: X.flatten()[np.argmin(np.abs(X-z))]

5

Expected output: `5`
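
`argmin` returns only the first minimizer, so it reports a single value even when several entries are equally close. One possible way to collect all of them is a boolean mask on the distances. The sketch below uses `z = 6`, an illustrative value chosen so that a tie occurs:

```python
import numpy as np

X = np.array([[1, -2, 7], [-4, 0, 10], [10, 8, 5]])
z = 6   # illustrative value: both 7 and 5 are at distance 1

dist = np.abs(X - z)
nearest = X[dist == dist.min()]   # every entry at the minimum distance
print(nearest)                    # [7 5]
```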

#### 6. Implement the sigmoid function

$$\sigma(x) = \frac{e^x}{1+e^{x}}$$

**Note**: $\sigma(\cdot)$ is a scalar function. If we apply it to a vector $\mathbf{x}=(x_1,\dots,x_n)$, it means that we are applying the function entry-wise, i.e., to every element $x_i$ of the vector.

In [216]:
def sigmoid(x):
    # hint: use np.exp
    sigmoid = np.exp(x)/(1+np.exp(x))
    
    return sigmoid


In [217]:
x = np.array([1,-1,0])

sigmoid(x)

array([0.73105858, 0.26894142, 0.5       ])

Expected output: `array([0.73105858, 0.26894142, 0.5])`
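
Dividing numerator and denominator by $e^x$ gives the equivalent form $\sigma(x) = \frac{1}{1+e^{-x}}$. The sketch below (the name `sigmoid_alt` is hypothetical) checks that both forms agree on the test input. For very large positive inputs, `np.exp(x)` overflows and the ratio in the original form becomes `nan`, while this form returns 1:

```python
import numpy as np

def sigmoid_alt(x):
    # Algebraically equal to exp(x) / (1 + exp(x))
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([1, -1, 0])
reference = np.exp(x) / (1 + np.exp(x))
print(np.allclose(sigmoid_alt(x), reference))   # True

# exp(-1000) underflows to 0, so the result is exactly 1
print(sigmoid_alt(np.array([1000.0])))          # [1.]
```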

#### 7. Implement `image2vector()` that takes an input of shape `(width, height, 3)` and returns a vector of shape `(width*height*3, 1)`.

In [220]:
def image2vector(image):
    # hint: try to use np.prod(image.shape)
    v = image.reshape((np.prod(image.shape),1))
    return v

In [221]:
# This is a 3 by 3 by 2 array, typically images will be (num_px_x, num_px_y, 3) where 3 represents the RGB values
image = np.array( [[[ 0.67826139,  0.29380381],
                    [ 0.90714982,  0.52835647],
                    [ 0.4215251 ,  0.45017551]],

                   [[ 0.92814219,  0.96677647],
                    [ 0.85304703,  0.52351845],
                    [ 0.19981397,  0.27417313]],

                   [[ 0.60659855,  0.00533165],
                    [ 0.10820313,  0.49978937],
                    [ 0.34144279,  0.94630077]]])

print (f"image2vector(image) = {image2vector(image)}")
print (f"output shape = {image2vector(image).shape}")

image2vector(image) = [[0.67826139]
 [0.29380381]
 [0.90714982]
 [0.52835647]
 [0.4215251 ]
 [0.45017551]
 [0.92814219]
 [0.96677647]
 [0.85304703]
 [0.52351845]
 [0.19981397]
 [0.27417313]
 [0.60659855]
 [0.00533165]
 [0.10820313]
 [0.49978937]
 [0.34144279]
 [0.94630077]]
output shape = (18, 1)


Expected output: 
```
image2vector(image) = [[0.67826139]
 [0.29380381]
 [0.90714982]
 [0.52835647]
 [0.4215251 ]
 [0.45017551]
 [0.92814219]
 [0.96677647]
 [0.85304703]
 [0.52351845]
 [0.19981397]
 [0.27417313]
 [0.60659855]
 [0.00533165]
 [0.10820313]
 [0.49978937]
 [0.34144279]
 [0.94630077]]
output shape = (18, 1)
```
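
An alternative sketch that avoids computing the number of elements by hand is to let `reshape` infer it with `-1`, as in the Shapes section. The array below is an illustrative stand-in for an image, not the one used above:

```python
import numpy as np

# Illustrative stand-in with the same shape as the example image
image = np.arange(18, dtype=float).reshape(3, 3, 2)

v_prod = image.reshape((np.prod(image.shape), 1))   # explicit size
v_auto = image.reshape(-1, 1)                       # size inferred by NumPy

print(v_auto.shape)                     # (18, 1)
print(np.array_equal(v_prod, v_auto))   # True
```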

#### 8. Implement `standardize_rows()` to standardize the rows of a matrix, by subtracting the mean and dividing by the standard deviation.
$$X_{\text{normalized}} = \frac{X -\mu}{\sigma}$$

where $\mu$ is the mean and $\sigma$ is the standard deviation of each row.

In [222]:
def standardize_rows(X):
    # hint: pay attention to the axis and the keepdims
    X = (X-X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
    return X


In [223]:
X = np.array([[0, 3, 4],
              [1, 6, 4]])

print(f"standardize_rows(X) =\n{standardize_rows(X)}")

standardize_rows(X) =
[[-1.37281295  0.39223227  0.98058068]
 [-1.29777137  1.13554995  0.16222142]]


Expected output:

```
standardize_rows(X) =
[[-1.37281295  0.39223227  0.98058068]
 [-1.29777137  1.13554995  0.16222142]]
```
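
A quick sanity check for this exercise is that every standardized row should have mean 0 and standard deviation 1. The sketch below repeats the function so that it is self-contained. Note that `std` uses the population standard deviation (`ddof=0`) by default:

```python
import numpy as np

def standardize_rows(X):
    # Row-wise mean and std, kept as (n_rows, 1) columns for broadcasting
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

X = np.array([[0, 3, 4],
              [1, 6, 4]])
Z = standardize_rows(X)

print(np.allclose(Z.mean(axis=1), 0))   # True
print(np.allclose(Z.std(axis=1), 1))    # True
```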

#### 9. Implement the softmax function

$$\text{softmax}(y)_i = \frac{e^{y_i}}{\sum_{j=1}^n e^{y_j}}$$

Here $y = (y_1, \dots, y_n)$ is one row of the input matrix.

In [224]:
def softmax(x):
    # 2 lines of code
    # hint: pay attention to axis and keepdims
    x_exp = np.exp(x)
    s = x_exp / np.sum(x_exp, axis=1, keepdims=True)

    return s

In [225]:
x = np.array([[9, 2, 5, 0, 0],
              [7, 5, 0, 0 ,0]])

print(f"softmax(x) =\n{softmax(x)}")

softmax(x) =
[[9.80897665e-01 8.94462891e-04 1.79657674e-02 1.21052389e-04
  1.21052389e-04]
 [8.78679856e-01 1.18916387e-01 8.01252314e-04 8.01252314e-04
  8.01252314e-04]]


Expected output: 

```
softmax(x) =
[[9.80897665e-01 8.94462891e-04 1.79657674e-02 1.21052389e-04
  1.21052389e-04]
 [8.78679856e-01 1.18916387e-01 8.01252314e-04 8.01252314e-04
  8.01252314e-04]]

```
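
For large inputs `np.exp` can overflow. A common variant, sketched here under the hypothetical name `softmax_stable`, subtracts the row maximum before exponentiating. The common factor cancels between numerator and denominator, so the result is mathematically unchanged:

```python
import numpy as np

def softmax_stable(x):
    # Shifting each row by its maximum keeps every exponent <= 0
    shifted = x - np.max(x, axis=1, keepdims=True)
    x_exp = np.exp(shifted)
    return x_exp / np.sum(x_exp, axis=1, keepdims=True)

x = np.array([[9, 2, 5, 0, 0],
              [7, 5, 0, 0, 0]])
s = softmax_stable(x)
print(np.allclose(s.sum(axis=1), 1))   # True: each row sums to 1

# Inputs this large would overflow without the shift
big = softmax_stable(np.array([[1000.0, 1000.0]]))
print(big)                             # [[0.5 0.5]]
```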

#### 10.  $L^1$ loss function (sum of absolute errors)

$$L_1(y, \hat{y}) = \sum_{i=1}^n |y_i - \hat{y}_i|$$

Dividing this sum by $n$ gives the Mean Absolute Error (MAE).

In [226]:
def L1(yhat, y):
    loss = np.sum(np.abs(y-yhat))
    return loss


In [227]:
yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])

print("L1 = " + str(L1(yhat,y)))

L1 = 1.1


Expected output: `L1 = 1.1`
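
The function above returns the sum of the absolute errors. As a small sketch of the relation to the mean absolute error, replacing `np.sum` with `np.mean` divides that sum by the number of samples (the names `l1` and `mae` are illustrative):

```python
import numpy as np

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])

l1 = np.sum(np.abs(y - yhat))     # sum of absolute errors
mae = np.mean(np.abs(y - yhat))   # the same sum divided by n = 5

print(np.isclose(l1, 1.1))        # True
print(np.isclose(mae, 0.22))      # True
```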


#### 11.  $L^2$ loss function (sum of squared errors)

$$L_2(y, \hat{y}) = \sum_{i=1}^n (y_i - \hat{y}_i)^2$$

Dividing this sum by $n$ gives the Mean Squared Error (MSE).

In [228]:
def L2(yhat, y):
    loss = np.sum((yhat - y)**2)
    return loss


In [229]:
yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])

print("L2 = " + str(L2(yhat,y)))

L2 = 0.43


Expected output: `L2 = 0.43`
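
The sum of squared errors can also be written as the inner product of the error vector with itself, and `np.mean` turns it into the mean squared error. A short sketch (the names `d`, `l2` and `mse` are illustrative):

```python
import numpy as np

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])

d = yhat - y
l2 = np.dot(d, d)      # inner product: same value as np.sum(d ** 2)
mse = np.mean(d ** 2)  # the same sum divided by n = 5

print(np.isclose(l2, 0.43))    # True
print(np.isclose(mse, 0.086))  # True
```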