{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ci4OMRfxs3kt"
   },
   "source": [
    "# Keras Tutorial #\n",
    "# Convolutional Neural Networks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "HOx-NATVs3kz"
   },
   "source": [
    "## Introduction\n",
    "\n",
    "In this tutorial, you are going to implement a simple Convolutional Neural Network in Keras.\n",
    "\n",
    "Convolutional Networks work by moving small filters across the input image. This means the filters are re-used for recognizing patterns throughout the entire input image. This makes the Convolutional Networks much more powerful than Fully Connected networks with the same number of variables. This in turn makes the Convolutional Networks faster to train."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "DJNbGiCCbz4L"
   },
   "source": [
    "If you are using Google Colab, run the following cell to mount your Google Drive"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 1476,
     "status": "ok",
     "timestamp": 1608040451542,
     "user": {
      "displayName": "Marco Toldo",
      "photoUrl": "",
      "userId": "10123192063759507523"
     },
     "user_tz": -60
    },
    "id": "Q8sa5cl3LwY_",
    "outputId": "bece965f-3a7b-4c9c-c18d-e004d1a2c27e"
   },
   "outputs": [],
   "source": [
    "if 'google.colab' in str(get_ipython()):\n",
    "    print('Running on CoLab')\n",
    "    import os\n",
    "    from google.colab import drive\n",
    "    drive.mount('/content/drive')\n",
    "    predefined_path = 'drive/MyDrive'  \n",
    "    my_path = ''  # place the path to your notebook in Google Drive\n",
    "    os.chdir(os.path.join(predefined_path, my_path))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "s0dE0Oqys3k0"
   },
   "source": [
    "## Flowchart\n",
    "The following chart shows the structure of the Convolutional Neural Network implemented below.\n",
    "\n",
    "![](https://drive.google.com/uc?id=140yiEyYl4rr6_zLRzHzpnWqukOuQCh3j)\n",
    "The input image is processed in the first convolutional layer using the filter-weights. This results in 16 new images, one for each filter in the convolutional layer. The images are also downsampled thus reducing the image resolution from 28x28 to 14x14.\n",
    "\n",
    "These 16 smaller images are then processed in the second convolutional layer. We need filter-weights for each of these 16 channels, and we need filter-weights for each output channel of this layer. There are 36 output channels so there are a total of 16 x 36 = 576 filters in the second convolutional layer. The resulting images are downsampled again to 7x7 pixels.\n",
    "\n",
    "The output of the second convolutional layer is composed by 36 images of 7x7 pixels each. These are then flattened to a single vector of length 7 x 7 x 36 = 1764, which is used as the input to a fully connected layer with 128 neurons (or elements). This feeds into another fully connected layer with 10 neurons, one for each of the classes, which is used to determine the class of the image, that is, which number is depicted in the image.\n",
    "\n",
    "The convolutional filters are initially chosen at random, so the classification is done randomly. The error between the predicted and true class of the input image is measured using the cross-entropy. The optimizer then automatically propagates this error back through the Convolutional Network using the chain-rule of differentiation and updates the filter-weights so as to improve the classification error. This is done iteratively thousands of times until the classification error is sufficiently low.\n",
    "\n",
    "These particular filter-weights and intermediate images are the results of one optimization run and may look different if you re-run this Notebook.\n",
    "\n",
    "Note that the computation in Keras is actually done on a batch of images instead of a single image, which makes the computation more efficient. This means the flowchart actually has one more data-dimension when implemented in Keras."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "9AcTwrXVs3k0"
   },
   "source": [
    "## Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "yNOsTFbXs3k1"
   },
   "outputs": [],
   "source": [
    "# TensorFlow and tf.keras\n",
    "import tensorflow as tf\n",
    "from tensorflow import keras\n",
    "from tensorflow.math import confusion_matrix\n",
    "\n",
    "# Helper libraries\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import math\n",
    "\n",
    "#!pip list"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "FppXAB5xs3k1"
   },
   "source": [
    "This was developed using Python 3.6 and TensorFlow version:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "rxoGqcVps3k2"
   },
   "outputs": [],
   "source": [
    "tf.__version__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "orjjyZ_zs3k2"
   },
   "source": [
    "## Load Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "KMBl2kiJs3k2"
   },
   "source": [
    "We are going to use the fashion MNIST dataset. It will be downloaded automatically if it is not located in the given path."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "dp3Qb-rvs3k2"
   },
   "outputs": [],
   "source": [
    "mnist = keras.datasets.fashion_mnist\n",
    "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "jkfUyQjRs3k3"
   },
   "source": [
    "The MNIST dataset has now been loaded and consists of 70.000 images and class-numbers for the images. The dataset is split into 2 mutually exclusive sub-sets."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "3gZWWUCys3k3"
   },
   "outputs": [],
   "source": [
    "print(\"Size of:\")\n",
    "print(\"- Training-set:\\t\\t{}\".format(train_images.shape[0]))\n",
    "print(\"- Test-set:\\t\\t{}\".format(test_images.shape[0]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "G8m1zjoys3k3"
   },
   "source": [
    "Each image contained in the dataset is represented in a vectorized (1D) form. We will reshape them to standard 2D images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "6U_5oXg0s3k3"
   },
   "outputs": [],
   "source": [
    "# We know that MNIST images are composed by 28 pixels in each dimension.\n",
    "img_shape = 28\n",
    "\n",
    "# Reshape MNIST images \n",
    "train_images = train_images.reshape((-1, img_shape, img_shape, 1), order=\"F\")\n",
    "test_images = test_images.reshape((-1, img_shape, img_shape, 1), order=\"F\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "J3Ms7W06s3k4"
   },
   "source": [
    "The loaded images have values in the range [0; 255].\n",
    "We normalize them in the range [0; 1]."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "S8d_I-grs3k4"
   },
   "outputs": [],
   "source": [
    "# Preprocessing\n",
    "train_images = train_images / 255.0\n",
    "test_images = test_images / 255.0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "AQoljJF1s3k4"
   },
   "source": [
    "### Label Encoding"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Je7Gwm7os3k4"
   },
   "source": [
    "The class labels are stored as a list of integers ranging from 0 to 9. These are the numbers associated to the images. The $i-th$ element in train_labels is the label related to the $i-th$ image in train_images (train_images[i]). test_images and test_labels have a similar structure."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "rT41j3xMs3k4"
   },
   "outputs": [],
   "source": [
    "# Number of classes and relative names\n",
    "num_classes = 10\n",
    "class_names = ['Tsh.', 'Trous.', 'Pull.', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneak.', 'Bag', 'A.boot']\n",
    "\n",
    "train_labels[0:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "97AzhSTfs3k5"
   },
   "source": [
    "### Helper-function for plotting images"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "X4FXHRxvs3k5"
   },
   "source": [
    "Function used to plot 9 images in a 3x3 grid, and writing the true and predicted classes below each image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "BvcEFTfOs3k5"
   },
   "outputs": [],
   "source": [
    "def plot_images(images, labels, predictions=None):\n",
    "    assert len(images) == len(labels) == 9\n",
    "    # Create figure with 3x3 sub-plots.\n",
    "    fig, axes = plt.subplots(3, 3)\n",
    "    fig.subplots_adjust(hspace=0.3, wspace=0.3)\n",
    "    for i, ax in enumerate(axes.flat):\n",
    "        # Plot image.\n",
    "        ax.imshow(images[i].squeeze(), cmap='binary')\n",
    "        # Show true and predicted classes.\n",
    "        if predictions is None:\n",
    "            xlabel = \"True: {0}\".format(class_names[labels[i]])\n",
    "        else:\n",
    "            xlabel = \"True: {0}, Pred: {1}\".format(class_names[labels[i]], class_names[predictions[i]])\n",
    "        ax.set_xlabel(xlabel)\n",
    "        # Remove ticks from the plot.\n",
    "        ax.set_xticks([])\n",
    "        ax.set_yticks([])\n",
    "    # Ensure the plot is shown correctly with multiple plots\n",
    "    # in a single Notebook cell.\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "38B3ha5Ps3k5"
   },
   "source": [
    "### Plot a few images to see if data is correct"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "4nDbqGa-s3k5"
   },
   "outputs": [],
   "source": [
    "# Get the first images from the test-set.\n",
    "images = test_images[0:9]\n",
    "\n",
    "# Get the true classes for those images.\n",
    "cls_true = test_labels[0:9]\n",
    "\n",
    "# Plot the images and labels using our helper-function above.\n",
    "plot_images(images=images, labels=cls_true)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "TehWyhXYs3k6"
   },
   "source": [
    "## Build the Classification Model\n",
    "\n",
    "Building the neural network requires configuring the layers of the model, then compiling it."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "iLOADpTAs3k6"
   },
   "source": [
    "### Input Tensor\n",
    "\n",
    "First, we allocate a Keras tensor that is going to contain the input data.\n",
    "\n",
    "The input is assumed to be a 3D tensor of shape [num_images, num_inputs, 1].\n",
    "The third dimension is 1 because we are dealing with grayscale images. It would be 3 for a color image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "0IbKezM5s3k6"
   },
   "outputs": [],
   "source": [
    "input = # use keras.layers.Input"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "rJmZ_B9Rs3k6"
   },
   "source": [
    "### Setup the Layers\n",
    "The presented neural network is composed by 2 convolutional layers, each one followed by a 2x2 max pooling layer, and 2 fully connected (dense) layers.\n",
    "\n",
    "Here, it is possible to change some parameters related to the layers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "SDSPt2ips3k6"
   },
   "outputs": [],
   "source": [
    "# feel free to change sizes and number of filters to see what happens !!!\n",
    "\n",
    "# Convolutional Layer 1.\n",
    "kernel_size1 = 5          # Convolution filters are 5 x 5 pixels.\n",
    "num_filters1 = 16         # There are 16 of these filters.\n",
    "\n",
    "# Convolutional Layer 2.\n",
    "filter_size2 = 5          # Convolution filters are 5 x 5 pixels.\n",
    "num_filters2 = 36         # There are 36 of these filters.\n",
    "\n",
    "# Fully connected layer.\n",
    "fc_size = 128             # Number of neurons in fully-connected layer.\n",
    "\n",
    "# The output fully connected layer is composed by 10 neurons as the number of classes we are going to use"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "9MbiIUvfs3k7"
   },
   "source": [
    "Now, we can start to setup our network.\n",
    "\n",
    "The first layer that we are going to add is a convolution layer. This is going to be applied on the `input` tensor.\n",
    "It is composed by `num_filters1` different filters, each having width and height equal to `kernel_size1`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "mXYFcX-Ws3k7"
   },
   "outputs": [],
   "source": [
    "conv1 = # use keras.layers.Conv2D"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "BBxHH3ats3k7"
   },
   "source": [
    "<img src=\"https://drive.google.com/uc?id=11Aa-ZgxGxqFEX-zI1j9Sp-TZdEcUTFfM\" alt=\"Drawing\" style=\"width: 600px;\"/>\n",
    "\n",
    "Then, we wish to downsample the image so it is half the size by using 2x2 max pooling.\n",
    "\n",
    "In this way we reduce the training complexity."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "W_K_8sg_s3k8"
   },
   "outputs": [],
   "source": [
    "pool1 = # use keras.layers.MaxPooling2D"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "_Xw2Zz39s3k8"
   },
   "source": [
    "<img src=\"https://drive.google.com/uc?id=1eT2oHTThs8_Jn0yrd9EpNQ-0opmg0Q_V\" alt=\"Drawing\" style=\"width: 600px;\"/>\n",
    "\n",
    "Similarly, we build a second set of convolutional and pooling layers.\n",
    "\n",
    "These are going to be applied on the output of the first pooling layer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "8-bjWgOds3k8"
   },
   "outputs": [],
   "source": [
    "# create second convolutional and pooling layers\n",
    "\n",
    "# ADD CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "BTr_3Iyes3k8"
   },
   "source": [
    "The convolutional layers output 3D tensors. We now wish to use these as input in a fully connected network, which requires for the tensors to be reshaped or flattened to vector.\n",
    "\n",
    "`tf.keras.layers.Flatten` transforms the format of its input from a 3D tensor ( [7, 7, 36] ), to a 1D array of 7 x 7 x 36 = 1764 elements. Think of this layer as unstacking rows of pixels in the image and lining them up. This layer has no parameters to learn; it only reformats the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Lpzwg06Xs3k8"
   },
   "outputs": [],
   "source": [
    "flat = # use keras.layers.Flatten"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "lBAfqIKss3k9"
   },
   "source": [
    "Add a fully connected layer to the network. The input is the flattened layer from the previous convolution. The number of neurons in the fully-connected layer is `fc_size`. ReLU is used so we can learn non-linear relations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "kkevlFjos3k9"
   },
   "outputs": [],
   "source": [
    "fc1 = # use keras.layers.Dense"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "aIIuoOXas3k9"
   },
   "source": [
    "<img src=\"https://drive.google.com/uc?id=10Tdxdn4nElZ_wGY9-q_LanTxZdNXHopw\" alt=\"Drawing\" style=\"width: 300px;\"/>\n",
    "\n",
    "Add another fully connected layer that outputs vectors of length 10 for determining which of the 10 classes the input image belongs to. Note that ReLU is not used in this layer. We are using the `softmax` here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "i45oiZzNs3k9"
   },
   "outputs": [],
   "source": [
    "# ADD CODE for another dense layer with softmax activation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "9cpdYgNbs3k9"
   },
   "source": [
    "Finally, we build the Keras model by specifing the `input` tensor and the output of the last layer, `output`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "P2V0_kNCs3k-"
   },
   "outputs": [],
   "source": [
    "model = #use keras.models.Model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "3fcYnVG_IZ1l"
   },
   "source": [
    "To load pre-trained model weights from a previous training of the same model, use the `model.load_weights()` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "AtbTxUeUD_kB"
   },
   "outputs": [],
   "source": [
    "load_path = ''  # insert path to weights\n",
    "if load_path:\n",
    "    model.load_weights(load_path)\n",
    "    print('Load trained weights from {}'.format(load_path))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "SKKSwaK9s3k-"
   },
   "source": [
    "We can also show the structure of the network by using the `summary()` method on the `model`. Notice how the largest part of the parameters is in the fully connected layer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "UmmuKM2is3k-"
   },
   "outputs": [],
   "source": [
    "model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "mh-DtWThs3k-"
   },
   "source": [
    "### Compile the Model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "z33pfLtks3k-"
   },
   "source": [
    "Before the model is ready for training, it needs a few more settings. These are added during the model's compile step:\n",
    "* 'Loss function' — This measures how accurate the model is during training. We want to minimize this function to \"steer\" the model in the right direction.\n",
    "* 'Metrics' — Used to monitor the training and testing steps. The following example uses accuracy, the fraction of the images that are correctly classified.\n",
    "* 'Optimizer' — This is how the model is updated based on the data it sees and its loss function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "rhPb-SK0s3k_"
   },
   "outputs": [],
   "source": [
    "# add code to compile the model, use adam optimizer with learning rate=0.001, beta_1=0.9 and beta_2=0.999)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Q2G3HLxrs3k_"
   },
   "source": [
    "### Train the Model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "479Ywjt4s3k_"
   },
   "source": [
    "Training the neural network model requires the following steps:\n",
    "\n",
    "1. Feed the training data to the model. In this example, the `train_images` and `train_labels` arrays.\n",
    "2. The model learns to associate images and labels.\n",
    "3. We ask the model to make predictions about a test set. In this example, the `test_images` array. We verify that the predictions match the labels from the `test_labels` array.\n",
    "\n",
    "To start training, call the `model.fit()` method. The model is \"fit\" to the training data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "IHOUH5bMs3k_"
   },
   "outputs": [],
   "source": [
    "# ADD CODE for training using the fit function\n",
    "# use a batch_size of 128 and run for 50 epochs. You can use 20% of the data for validation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "LzQUHijXs3k_"
   },
   "source": [
    "As the model trains, the loss and accuracy metrics are displayed. \n",
    "Pay attention at the use of `validation_split` to fix the ratio of the training set used as validation set."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "gf-S9JCoGY92"
   },
   "source": [
    "We can retrieve loss and accuracy values directly from the output of `model.fit()` and plot them to check the training progress over the epochs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 367
    },
    "executionInfo": {
     "elapsed": 147862,
     "status": "ok",
     "timestamp": 1608040598119,
     "user": {
      "displayName": "Marco Toldo",
      "photoUrl": "",
      "userId": "10123192063759507523"
     },
     "user_tz": -60
    },
    "id": "_mih-HXu8Quq",
    "outputId": "aede5a5d-76a8-42b7-b92e-93cfe11dad38"
   },
   "outputs": [],
   "source": [
    "fig, axes = plt.subplots(1, 2, figsize=(15,5))\n",
    "\n",
    "\n",
    "print('History data: '+', '.join(history.history.keys()))\n",
    "key_list = list(history.history.keys())\n",
    "\n",
    "axes[0].plot(history.history[key_list[2]], label=key_list[2])\n",
    "axes[0].plot(history.history[key_list[0]], label='train_' + key_list[0])\n",
    "axes[0].grid('both')\n",
    "axes[0].set_title('Loss')\n",
    "axes[0].set_xlabel('Epoch')\n",
    "axes[0].legend()\n",
    "\n",
    "axes[1].plot(history.history[key_list[3]], label=key_list[3])\n",
    "axes[1].plot(history.history[key_list[1]], label='train_' + key_list[1])\n",
    "axes[1].grid('both')\n",
    "axes[1].set_title('Accuracy')\n",
    "axes[1].set_xlabel('Epoch')\n",
    "axes[1].legend()\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "TXx7CUM2HdCX"
   },
   "source": [
    "To save the model weights after training, use the `model.save_weights()` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 148069,
     "status": "ok",
     "timestamp": 1608040598337,
     "user": {
      "displayName": "Marco Toldo",
      "photoUrl": "",
      "userId": "10123192063759507523"
     },
     "user_tz": -60
    },
    "id": "frI2yDosB_Ye",
    "outputId": "f8fb4346-8f15-4dd4-ad03-9579d36a59a8"
   },
   "outputs": [],
   "source": [
    "save_path = 'log/model_ep{}'.format(len(history.history['loss']))  # set path where to save weights\n",
    "if save_path:\n",
    "    model.save_weights(save_path)\n",
    "    print('Saved trained weights at {}'.format(save_path))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "KrD9OISOs3k_"
   },
   "source": [
    "### Helper-functions to show performance"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "TQZGFh1Ns3k_"
   },
   "outputs": [],
   "source": [
    "def print_confusion_matrix(model, images, labels):\n",
    "    num_classes = 10\n",
    "    # Get the predicted classifications for the test-set.\n",
    "    predictions = model.predict(images)\n",
    "    # Get the true classifications for the test-set.\n",
    "\n",
    "    # Get the confusion matrix using sklearn.\n",
    "    cm = confusion_matrix(labels, np.argmax(predictions,axis=1))\n",
    "    # Print the confusion matrix as text.\n",
    "    print(cm)\n",
    "    # Plot the confusion matrix as an image.\n",
    "    plt.imshow(cm, interpolation='nearest', cmap=plt.cm.Blues)\n",
    "    # Make various adjustments to the plot.\n",
    "    plt.tight_layout()\n",
    "    plt.colorbar()\n",
    "    tick_marks = np.arange(num_classes)\n",
    "    plt.xticks(tick_marks, class_names, rotation=45)\n",
    "    plt.yticks(tick_marks, class_names, rotation=0)\n",
    "    plt.ylim(num_classes-0.5,-0.5)\n",
    "    plt.xlabel('Predicted')\n",
    "    plt.ylabel('True')\n",
    "    # Ensure the plot is shown correctly with multiple plots\n",
    "    # in a single Notebook cell.\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Z1HuITEGs3lA"
   },
   "source": [
    "Function for plotting examples of images from the test set that have been mis-classified."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Fe9q1vw3s3lA"
   },
   "outputs": [],
   "source": [
    "def plot_example_errors(model, images, labels):\n",
    "    # Get the predicted classifications for the test-set.\n",
    "    predictions = model.predict(images)\n",
    "\n",
    "    predictions_in = np.argmax(predictions, axis=1)\n",
    "    correct = (predictions_in == labels)\n",
    "\n",
    "    # Negate the boolean array.\n",
    "    incorrect = (correct == False)\n",
    "    # Get the images from the test-set that have been\n",
    "    # incorrectly classified.\n",
    "    images = images[incorrect]\n",
    "    # Get the predicted classes for those images.\n",
    "    cls_pred = predictions_in[incorrect]\n",
    "    # Get the true classes for those images.\n",
    "    cls_true = labels[incorrect]\n",
    "    # Plot the first 9 images.\n",
    "    plot_images(images=images[0:9],\n",
    "    labels=cls_true[0:9],\n",
    "    predictions=cls_pred[0:9])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "DoDDza4vs3lA"
   },
   "source": [
    "## Performance after 5 epochs\n",
    "\n",
    "After 5 epochs, the model only mis-classifies a very few images. As demonstrated below, some of the mis-classifications are justified because the images are very hard to determine with certainty even for humans, while few examples are quite obvious but nevertheless the proposed 4-layer network fails. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "9bcKL0qgs3lA"
   },
   "outputs": [],
   "source": [
    "test_loss, test_acc = # use model.evaluate\n",
    "print('Test accuracy:', test_acc)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "My7uuiuVs3lA"
   },
   "outputs": [],
   "source": [
    "plot_example_errors(model, test_images, test_labels)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "nApGriSEs3lA"
   },
   "source": [
    "We can also print and plot the so-called confusion matrix which lets us see more details about the mis-classifications. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "q7-9wkqSs3lB"
   },
   "outputs": [],
   "source": [
    "print_confusion_matrix(model, test_images, test_labels)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "KgKeuerqs3lB"
   },
   "source": [
    "## Visualization of Weights and Layers\n",
    "In trying to understand why the convolutional neural network can recognize handwritten digits, we will now visualize the weights of the convolutional\n",
    "filters and the resulting output images."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "-PgJAwW7s3lB"
   },
   "source": [
    "## Helper-function for plotting convolutional weights"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "mDexd8Grs3lB"
   },
   "outputs": [],
   "source": [
    "def plot_conv_weights(w, input_channel=0):\n",
    "    # Get the lowest and highest values for the weights.\n",
    "    # This is used to correct the colour intensity across\n",
    "    # the images so they can be compared with each other.\n",
    "    w_min = np.min(w)\n",
    "    w_max = np.max(w)\n",
    "    # Number of filters used in the conv. layer.\n",
    "    num_filters = w.shape[3]\n",
    "    # Number of grids to plot.\n",
    "    # Rounded-up, square-root of the number of filters.\n",
    "    num_grids = math.ceil(math.sqrt(num_filters))\n",
    "    # Create figure with a grid of sub-plots.\n",
    "    fig, axes = plt.subplots(num_grids, num_grids, figsize=(13,13))\n",
    "    # Plot all the filter-weights.\n",
    "    for i, ax in enumerate(axes.flat):\n",
    "        # Only plot the valid filter-weights.\n",
    "        if i<num_filters:\n",
    "            # Get the weights for the i'th filter of the input channel.\n",
    "            # See new_conv_layer() for details on the format\n",
    "            # of this 4-dim tensor.\n",
    "            img = w[:, :, input_channel, i]\n",
    "            # Plot image.\n",
    "            ax.imshow(img, vmin=w_min, vmax=w_max,\n",
    "            interpolation='nearest', cmap='seismic')\n",
    "            # Remove ticks from the plot.\n",
    "            ax.set_xticks([])\n",
    "            ax.set_yticks([])\n",
    "    # Ensure the plot is shown correctly with multiple plots\n",
    "    # in a single Notebook cell.\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "z-HXgI0Qs3lB"
   },
   "source": [
    "## Helper-function for plotting the output of a convolutional layer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "AMp5jJTYs3lC"
   },
   "outputs": [],
   "source": [
    "def plot_conv_layer(layer='conv1', image=None):\n",
    "    # Calculate and retrieve the output values of the layer\n",
    "    # when inputting that image.\n",
    "    if len(image.shape)==3:\n",
    "        image = np.expand_dims(image,axis=0)\n",
    "    intermediate_model = keras.models.Model(inputs=model.input, outputs=model.get_layer(layer).output)\n",
    "    values = intermediate_model.predict(image)\n",
    "    # Number of filters used in the conv. layer.\n",
    "    num_filters = values.shape[3]\n",
    "    # Number of grids to plot.\n",
    "    # Rounded-up, square-root of the number of filters.\n",
    "    num_grids = math.ceil(math.sqrt(num_filters))\n",
    "    # Create figure with a grid of sub-plots.\n",
    "    fig, axes = plt.subplots(num_grids, num_grids, figsize=(15,15))\n",
    "    # Plot the output images of all the filters.\n",
    "    for i, ax in enumerate(axes.flat):\n",
    "        # Only plot the images for valid filters.\n",
    "        if i<num_filters:\n",
    "            # Get the output image of using the i'th filter.\n",
    "            # See new_conv_layer() for details on the format\n",
    "            # of this 4-dim tensor.\n",
    "            img = values[0, :, :, i]\n",
    "            # Plot image.\n",
    "            ax.imshow(img, interpolation='nearest', cmap='binary')\n",
    "            # Remove ticks from the plot.\n",
    "            ax.set_xticks([])\n",
    "            ax.set_yticks([])\n",
    "    # Ensure the plot is shown correctly with multiple plots\n",
    "    # in a single Notebook cell.\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ML_qh0Ros3lC"
   },
   "source": [
    "## Network Analysis\n",
    "Given a input image, we are going to observe the ouput of the intermediate layers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "KzN4ixJOs3lC"
   },
   "source": [
    "## Input image\n",
    "We pick an image from the test set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Xv5klYD9s3lC"
   },
   "outputs": [],
   "source": [
    "test_image = test_images[0]\n",
    "\n",
    "plt.imshow(np.squeeze(test_image), interpolation='nearest', cmap='binary')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "H0W77jzds3lC"
   },
   "source": [
    "## First Convolutional Layer\n",
    "Now, we plot the filter-weights for the first convolutional layer.\n",
    "\n",
    "Note that positive weights are red and negative weights are blue."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "obERQU5Xs3lC"
   },
   "outputs": [],
   "source": [
    "# Show weights Conv 1\n",
    "w_conv1, b_conv1 = model.get_layer('conv1').get_weights()\n",
    "plot_conv_weights(w_conv1, input_channel=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "OTcPB-wBs3lD"
   },
   "source": [
    "Applying each of these convolutional filters to the first input image gives the following output images, which are then used as input to the second convolutional layer. Note that these images are downsampled to 14x14 pixels which is half the resolution of the original input image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "u61U9Drqs3lD"
   },
   "outputs": [],
   "source": [
    "# Accepted values for layer are: 'conv1', 'pool1', 'conv2' and 'pool2'\n",
    "plot_conv_layer(layer='conv1', image=test_images[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pvascm_bs3lD"
   },
   "source": [
    "It is difficult to see from these images what the purpose of the convolutional filters might be. It appears that they have merely created several variations of the input image, as if light was shining from different angles and casting shadows in the image."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "H3xxi9bls3lD"
   },
   "source": [
    "## Second Convolutional Layer\n",
    "Now plot the filter-weights for the second convolutional layer.\n",
    "\n",
    "There are 16 output channels from the first conv-layer, which means there are 16 input channels to the second conv-layer. The second conv-layer has a set of filter-weights for each of its input channels. We start by plotting the filter-weigths for the first channel.\n",
    "\n",
    "Note again that positive weights are red and negative weights are blue."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "pB2SPq3Ms3lD"
   },
   "outputs": [],
   "source": [
    "# Show weights Conv 2\n",
    "w_conv2, b_conv2 = model.get_layer('conv2').get_weights()\n",
    "plot_conv_weights(w_conv2, input_channel=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "WpI-nvx-s3lD"
   },
   "source": [
    "There are 16 input channels to the second convolutional layer, so we can make another 15 plots of filter-weights like this. We just make one more with the filter-weights for the second channel.\n",
    "\n",
    "It can be difficult to understand and keep track of how these filters are applied because of the high dimensionality.\n",
    "\n",
    "Applying these convolutional filters to the images that were ouput from the first conv-layer gives the following images.\n",
    "\n",
    "Note that these are downsampled yet again to 7x7 pixels which is half the resolution of the images from the first conv-layer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "r4jdVWWcs3lD"
   },
   "outputs": [],
   "source": [
    "# Accepted values for layer are: 'conv1', 'pool1', 'conv2' and 'pool2'\n",
    "plot_conv_layer(layer='conv2', image=test_images[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "H2eVuvWCs3lE"
   },
   "source": [
    "From these images, it looks like the second convolutional layer might detect lines and patterns in the input images, which are less sensitive to local variations in the original input images.\n",
    "\n",
    "These images are then flattened and input to the fully connected layer, but that is not shown here."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "rxBHFES2s3lE"
   },
   "source": [
    "## Conclusion\n",
    "\n",
    "We have seen how to implement a Convolutional Neural Network recognizing hand-written digits. The Convolutional Network gets a classification accuracy of about 91%."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Zklli6d0s3lE"
   },
   "source": [
    "\n",
    "\n",
    "\n",
    "\n",
    "## License (MIT)\n",
    "\n",
    "The tutorial has been adapted for the computer vision course and converted into the Keras framework by Gianluca Agresti.\n",
    "Comments revised by P. Zanuttigh.\n",
    "\n",
    "Based on the work from Magnus Erik Hvass Pedersen.\n",
    "\n",
    "Copyright (c) 2016 by [Magnus Erik Hvass Pedersen](http://www.hvass-labs.org/)\n",
    "\n",
    "Revised P. Zanuttigh\n",
    "\n",
    "Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n",
    "\n",
    "The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n",
    "\n",
    "THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "eDlMZVr7s3lE"
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "colab": {
   "name": "CNN_tutorial_to_be_solve_2020.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}