{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Neural Networks for Classification and Clustering\n", "\n", "In this notebook we are going to explore the use of Neural Networks for image classification. We are going to use a dataset of small images of clothes and accessories, the Fashion MNIST. You can find more information regarding the dataset here: https://pravarmahajan.github.io/fashion/\n", "\n", "Each instance in the dataset consists of an image, in a format similar to the digit images you have seen in the previous homework, and a label. The labels correspond to the type of clothing, as follows:\n", "\n", "| Label | Description |\n", "| --- | --- |\n", "| 0 | T-shirt/top |\n", "| 1 | Trouser |\n", "| 2 | Pullover |\n", "| 3 | Dress |\n", "| 4 | Coat |\n", "| 5 | Sandal |\n", "| 6 | Shirt |\n", "| 7 | Sneaker |\n", "| 8 | Bag |\n", "| 9 | Ankle boot |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's first load the required packages." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "#load the required packages\n", "\n", "%matplotlib inline \n", "\n", "import numpy as np\n", "import scipy as sp\n", "import matplotlib.pyplot as plt\n", "\n", "import sklearn\n", "from sklearn.neural_network import MLPClassifier\n", "from sklearn.model_selection import GridSearchCV" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following is a helper function to load the data; we are going to use it later in the notebook." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# helper function to load Fashion MNIST dataset from disk\n", "def load_fashion_mnist(path, kind='train'):\n", "    import os\n", "    import gzip\n", "    import numpy as np\n", "    labels_path = os.path.join(path, '%s-labels-idx1-ubyte.gz' % kind)\n", "    images_path = os.path.join(path, '%s-images-idx3-ubyte.gz' % kind)\n", "    # IDX label files start with an 8-byte header (magic number, item count)\n", "    with gzip.open(labels_path, 'rb') as lbpath:\n", "        labels = np.frombuffer(lbpath.read(), dtype=np.uint8, offset=8)\n", "    # IDX image files start with a 16-byte header (magic number, count, rows, columns)\n", "    with gzip.open(images_path, 'rb') as imgpath:\n", "        images = np.frombuffer(imgpath.read(), dtype=np.uint8, offset=16).reshape(len(labels), 784)\n", "    return images, labels" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 0\n", "Place your ID (\"numero di matricola\") here; it will be used as the seed for the random generator. Change the ID number in case you observe unexpected behaviours and want to test if this is due to randomization (e.g., train/test split). If you change the ID number, explain why you have changed it." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "ID = 2051998\n", "np.random.seed(ID)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we load the dataset using the function above." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "#load the fashion MNIST dataset and normalize the features so that each value is in [0,1]\n", "X, y = load_fashion_mnist(\"data\")\n", "# rescale the data\n", "X = X / 255.0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we split the data into training and test sets. Make sure that each label is present at least 10 times\n", "in the training set." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Labels in training dataset: [0 1 2 3 4 5 6 7 8 9]\n", "Frequencies in training dataset: [54 42 54 57 50 42 41 49 51 60]\n" ] } ], "source": [ "#randomly permute the data and split it into training and test sets, taking the first 500\n", "#samples for training and the rest for test\n", "permutation = np.random.permutation(X.shape[0])\n", "\n", "X = X[permutation]\n", "y = y[permutation]\n", "\n", "m_training = 500\n", "\n", "X_train, X_test = X[:m_training], X[m_training:]\n", "y_train, y_test = y[:m_training], y[m_training:]\n", "\n", "labels, freqs = np.unique(y_train, return_counts=True)\n", "print(\"Labels in training dataset: \", labels)\n", "print(\"Frequencies in training dataset: \", freqs)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following function plots an image and the corresponding label, to be used to inspect the data when needed." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "#function for plotting an image and printing the corresponding label\n", "def plot_input(X_matrix, labels, index):\n", "    print(\"INPUT:\")\n", "    plt.imshow(\n", "        X_matrix[index].reshape(28,28),\n", "        cmap = plt.cm.gray_r,\n", "        interpolation = \"nearest\"\n", "    )\n", "    plt.show()\n", "    print(\"LABEL: %i\"%labels[index])\n", "    return" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's test the function above and check a few images." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INPUT:\n" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAPsAAAD4CAYAAAAq5pAIAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAQPUlEQVR4nO3dbYyV9ZnH8d8lgiAPCs7wICgUJaTGWNqMxIRNda3boDFiX9TUxMZNDNMYSdqk0SXdF/WFL8xmabMvNk1wNWU3KNa0Rl/4AEGj9k3jaCiOS3ZRMlhknAdRHOQZrn0xt80U5r7+x3Ofp+X//SSTM3Ou8z/nP/ec35yZc933/Td3F4AL30XtngCA1iDsQCYIO5AJwg5kgrADmbi4lQ/W1dXly5Yta+VDAlkZGBjQ6OioTVarFHYzWyvp3yRNkfQf7v54dPtly5apr6+vykMCCPT09JTW6v4z3symSPp3SbdLuk7SvWZ2Xb33B6C5qvzPvlrSB+6+z91PStomaV1jpgWg0aqEfbGkv0z4+kBx3d8ws14z6zOzvpGRkQoPB6CKKmGf7E2A8/a9dffN7t7j7j3d3d0VHg5AFVXCfkDSVRO+XiLpYLXpAGiWKmF/W9IKM/uGmU2T9CNJLzZmWgAare7Wm7ufNrMNkl7VeOvtKXd/v2EzA9BQlfrs7v6SpJcaNBcATcTuskAmCDuQCcIOZIKwA5kg7EAmCDuQCcIOZIKwA5kg7EAmCDuQCcIOZIKwA5kg7EAmCDuQCcIOZIKwA5kg7EAmCDuQCcIOZIKwA5kg7EAmCDuQCcIOZIKwA5kg7EAmCDuQCcIOZIKwA5kg7EAmKq3iCrzyyithfevWraW1NWvWhGOXLl0a1q+99tqwvmLFirCem0phN7MBSWOSzkg67e49jZgUgMZrxCv737v7aAPuB0AT8T87kImqYXdJ283sHTPrnewGZtZrZn1m1jcyMlLx4QDUq2rY17j7dyTdLukhM/vuuTdw983u3uPuPd3d3RUfDkC9KoXd3Q8Wl8OSnpe0uhGTAtB4dYfdzGaa2eyvPpf0fUn9jZoYgMaq8m78AknPm9lX9/O0u8dNV1xwHn744bDe31/++/+5554Lx544caKuOTXCrFmzwnrqX9IrrrgirK9cubK0tnHjxnDs9ddfH9bL1B12d98n6Vv1jgfQWrTegEwQdiAThB3IBGEHMkHYgUxwiOsF7uzZs2H9oouq/b4fGhoK6wsXLiytFW3bUhdfXO3pOTY2VlpLfd+nT58O60eOHAnrl156aViPDv3dv39/OPatt94K62V4ZQcyQdiBTBB2IBOEHcgEYQcyQdiBTBB2IBP02S9w7t7U+0+daqyrq6u0NmPGjHBsah+BqVOnhvUqY1N9+EceeSSsv/HGG2H98OHDpbUHH3wwHFsvXtmBTBB2IBOEHcgEYQcyQdiBTBB2IBOEHcgEffYL3JQpUyqN//jjjyuNj45Jj3rNtUgdMx71ylPHo0enepak7du3h/UdO3aE9eXLl5fWNmzYEI696667SmvRvgm8sgOZIOxAJgg7kAnCDmSCsAOZIOxAJgg7kAn67B0gdcx5ql713O+R9evXh/WZM2eG9TNnzpTWTp48GY5NHe+e2i7RY8+ePTscu2/fvrB+zz33hPXjx4+H9eHh4dLa2rVrw7HRctLRcyH5LDGzp8xs2Mz6J1w3z8x2mNne4nJu6n4AtFctLwm/lXTur5qNkna6+wpJO4uvAXSwZNjd/U1Jh865ep2kLcXnWyTd3dhpAWi0ev/ZW+Dug5JUXM4vu6GZ9ZpZn5n1pc5XBqB5mv5uvLtv
dvced+/p7u5u9sMBKFFv2IfMbJEkFZflby0C6Aj1hv1FSfcXn98v6YXGTAdAsyT77Gb2jKRbJHWZ2QFJv5T0uKTfmdkDkj6S9MNmTvL/u1Q/OLVOecqpU6dKa6nzo7/22mth/eWXXw7rS5cuDeujo6OltVSve/r06WE91YePpNZ+P3HiRFjftGlTWE9tl6iP39/fX1qrIhl2d7+3pPS9Bs8FQBOxuyyQCcIOZIKwA5kg7EAmCDuQCQ5xbYFUa63KoZpS3F5LLXt82223hfVUC+nLL78M69GprKOWoSRdcsklYT21XaPWXaolmWr7jY2NhfW9e/eG9eh737ZtWzj2vvvuC+tleGUHMkHYgUwQdiAThB3IBGEHMkHYgUwQdiAT9NkbINUnT0n1i1OHY0an+1q4cGE49rLLLgvrqbmdPn06rE+bNq20ljqVdGq56aNHj4b16P5TP7PUNk8tFx1931K8f8LWrVvDsfTZAYQIO5AJwg5kgrADmSDsQCYIO5AJwg5kgj57IdXzjfquzVwyWZKeeOKJsN7b21taW7BgQTg21Wf/5JNPwnrqmPNUHz5y7NixsJ7qZUc/l9T+A6kll6v+zKM+/auvvlrpvsvwyg5kgrADmSDsQCYIO5AJwg5kgrADmSDsQCY6qs9e5fjm1LHPKamebRW7d+8O6+vWrQvrQ0NDYX3RokWltVQfPLU0caqfnNru0fnRU2Orzi16PqX6/6nz7aeOd6/Sp08dax+dkz7aZslXdjN7ysyGzax/wnWPmtnHZrar+LgjdT8A2quWP+N/K2ntJNf/2t1XFR8vNXZaABotGXZ3f1PSoRbMBUATVXmDboOZ7S7+zJ9bdiMz6zWzPjPri86VBqC56g37byRdI2mVpEFJm8pu6O6b3b3H3Xu6u7vrfDgAVdUVdncfcvcz7n5W0hOSVjd2WgAara6wm9nEXs8PJPWX3RZAZ0j22c3sGUm3SOoyswOSfinpFjNbJcklDUj6SSMm08xed8qhQ/F7kK+//npp7bHHHgvH7tq1K6x3dXWF9fnz54f1qJ+c6vem1ilPraGeGh/1o1P3XfWY8Sr7ZaT67FXXCkj16SMffvhhaS3qsycf0d3vneTqJ2uaFYCOwe6yQCYIO5AJwg5kgrADmSDsQCY66hDX6NA9SVq/fn1pLXXI4pEjR8L6p59+GtY/++yz0lqqZbh06dKwnmrzpA79jb73VHur6nLRKdH4M2fOhGNTh+em2n6R1M9s+vTpYT213aoeGhwZHBwsrUXPFV7ZgUwQdiAThB3IBGEHMkHYgUwQdiAThB3IREv77KdOnQp7hHfeeWc4fv/+/aW1efPmhWNThySmerbRWXZS/eJUnzzVk03NPer5pvrkqXq0tLAkHT16NKxHc0ttt1QvOrVdon0MUo8d7VfRCDNmzKh7bPRcjJ7HvLIDmSDsQCYIO5AJwg5kgrADmSDsQCYIO5CJlvbZDx06pGeffba0vmTJknD85ZdfXlqL+vdSutc9NjYW1qNjzlPHNqdUWfa4qtHR0bCe2v8g9b1H2y31fafqqV55tA9AanWiFStWhPXU/gkLFy4M63PmzCmtRfuTSNI111xTWovOAcArO5AJwg5kgrADmSDsQCYIO5AJwg5kgrADmWhpn33OnDm69dZbS+upc3V//vnnpbWDBw+GY1PnjU8t2Xz48OG65iVJQ0NDYf3YsWNhPdXrjuqpfvDKlSvD+syZM8N6tO+DJM2dO7e0ljqmO9WrXrx4cVj/6KOPSmupcwgsWLAgrKek9j+IjsUfGBgIx0b7F0T3m3xlN7OrzOx1M9tjZu+b2U+L6+eZ2Q4z21tclv9UAbRdLX/Gn5b0c3f/pqSbJD1kZtdJ2ihpp7uvkLSz+BpAh0qG3d0H3f3d4vMxSXskLZa0TtKW4mZbJN3dpDkCaICv9QadmS2T9G1Jf5K0wN0HpfFfCJLml4zpNbM+M+tr9nm9AJSrOexmNkvS7yX9zN2/qHWcu292
9x5374nerAHQXDWF3cymajzoW939D8XVQ2a2qKgvkjTcnCkCaIRk683GewhPStrj7r+aUHpR0v2SHi8uX0jd14wZM3TDDTeU1pcvXx6O37NnT2kt1d5Ktda++KLmP1bOk1ouOnUoZuoQ1uHh+Pdo9PipU0Gn5pZq+1155ZVhvaurq7R29dVXh2NTLainn346rEdtwdQS3VGrVUov+Zw6zfXs2bNLaydOnAjHRqdcj36etfTZ10j6saT3zGxXcd0vNB7y35nZA5I+kvTDGu4LQJskw+7uf5RUtofA9xo7HQDNwu6yQCYIO5AJwg5kgrADmSDsQCZaeohryqxZs8L6jTfe2KKZnC86RHZkZCQcmzqENTrdsiQdP348rEe99OiUxZI0f/6kezn/Vaqf3Ew33XRTWE8d4lrv0sa1SB0imzpcO9quqR599DOLMsQrO5AJwg5kgrADmSDsQCYIO5AJwg5kgrADmeioPnsni/qXqf0DUJ9UL/vmm29u0UwuDLyyA5kg7EAmCDuQCcIOZIKwA5kg7EAmCDuQCcIOZIKwA5kg7EAmCDuQCcIOZIKwA5kg7EAmCDuQiWTYzewqM3vdzPaY2ftm9tPi+kfN7GMz21V83NH86QKoVy0nrzgt6efu/q6ZzZb0jpntKGq/dvd/bd70ADRKLeuzD0oaLD4fM7M9kuKlOAB0nK/1P7uZLZP0bUl/Kq7aYGa7zewpM5tbMqbXzPrMrC+1TBKA5qk57GY2S9LvJf3M3b+Q9BtJ10hapfFX/k2TjXP3ze7e4+490dpbAJqrprCb2VSNB32ru/9Bktx9yN3PuPtZSU9IWt28aQKoqpZ3403Sk5L2uPuvJly/aMLNfiCpv/HTA9Aotbwbv0bSjyW9Z2a7iut+IeleM1slySUNSPpJE+YHoEFqeTf+j5JsktJLjZ8OgGZhDzogE4QdyARhBzJB2IFMEHYgE4QdyARhBzJB2IFMEHYgE4QdyARhBzJB2IFMEHYgE4QdyIS5e+sezGxE0v4JV3VJGm3ZBL6eTp1bp85LYm71auTclrr7pOd/a2nYz3twsz5372nbBAKdOrdOnZfE3OrVqrnxZzyQCcIOZKLdYd/c5sePdOrcOnVeEnOrV0vm1tb/2QG0Trtf2QG0CGEHMtGWsJvZWjP7HzP7wMw2tmMOZcxswMzeK5ah7mvzXJ4ys2Ez659w3Twz22Fme4vLSdfYa9PcOmIZ72CZ8bZuu3Yvf97y/9nNbIqk/5X0D5IOSHpb0r3u/t8tnUgJMxuQ1OPubd8Bw8y+K+mIpP909+uL6/5F0iF3f7z4RTnX3f+pQ+b2qKQj7V7Gu1itaNHEZcYl3S3pH9XGbRfM6x61YLu145V9taQP3H2fu5+UtE3SujbMo+O5+5uSDp1z9TpJW4rPt2j8ydJyJXPrCO4+6O7vFp+PSfpqmfG2brtgXi3RjrAvlvSXCV8fUGet9+6StpvZO2bW2+7JTGKBuw9K408eSfPbPJ9zJZfxbqVzlhnvmG1Xz/LnVbUj7JMtJdVJ/b817v4dSbdLeqj4cxW1qWkZ71aZZJnxjlDv8udVtSPsByRdNeHrJZIOtmEek3L3g8XlsKTn1XlLUQ99tYJucTnc5vn8VSct4z3ZMuPqgG3XzuXP2xH2tyWtMLNvmNk0ST+S9GIb5nEeM5tZvHEiM5sp6fvqvKWoX5R0f/H5/ZJeaONc/kanLONdtsy42rzt2r78ubu3/EPSHRp/R/5DSf/cjjmUzGu5pD8XH++3e26SntH4n3WnNP4X0QOSrpC0U9Le4nJeB83tvyS9J2m3xoO1qE1z+zuN/2u4W9Ku4uOOdm+7YF4t2W7sLgtkgj3ogEwQdiAThB3IBGEHMkHYgUwQdiAThB3IxP8BbEk4U0oIvFYAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "LABEL: 9\n", "INPUT:\n" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAPsAAAD4CAYAAAAq5pAIAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAQe0lEQVR4nO3dX4yV9ZkH8O9XBET+CTKyA0Wn23gh2WRpc0I2oo2bZhv1BnrRplw0bKILF5i0sRdrbCJ6p03/pBebRqr86aYradKixBi3BpuY3gBHZBWWuCKwdHDCnBETQBBm4OnFvOyOOOf3DOf3vuc95fl+EjIz55lzzjOH+fIeznN+v5dmBhG58d1UdwMi0h0Ku0gQCrtIEAq7SBAKu0gQN3fzzhYtWmQDAwPdvMvwRkdHk/WRkZFk3ZvWkEzW+/v7k3Up1/HjxzEyMjLpX0pW2Ek+COAXAKYBeMHMnk19/8DAAJrNZs5dhpQTuKGhoeR1t27dmqxfvHgxWZ82bVqy/tRTTyXrKVeuXEnWb7pJT0yv1Wg02tY6frRITgPwbwAeArAcwFqSyzu9PRGpVs4/jSsBHDGzo2Z2CcAOAKvLaUtEypYT9qUA/jzh68Hiss8huZ5kk2Sz1Wpl3J2I5MgJ+2T/UfzCfy7NbLOZNcys0dfXl3F3IpIjJ+yDAJZN+PpLAD7Ka0dEqpIT9n0A7ib5ZZIzAHwXwK5y2hKRsnU8ejOzMZKPAfhPjI/etpjZodI6CyR3xPT666+3rW3bti153eeeey5Zv+uuu5L1F154IVnfsGFD29rzzz+fvK5Ga+XKmrOb2WsAXiupFxGpkP7pFAlCYRcJQmEXCUJhFwlCYRcJQmEXCaKr69mjqnqp5quvvtq2tmPHjqzb9jz66KPJempJ8zPPPJO87qZNm5J1b/ntzJkzk/VodGQXCUJhFwlCYRcJQmEXCUJhFwlCYRcJQqO3LhgbG0vWZ8yYkazv3bs3WT9//vx193SVNxb0drb1dpe955572tZ27tyZvK43epProyO7SBAKu0gQCrtIEAq7SBAKu0gQCrtIEAq7SBCas3eBN8v2DA4OJuv79u3r+La95bUXLlxI1mfNmpWsp94DcOTIkeR1PVrCen10ZBcJQmEXCUJhFwlCYRcJQmEXCUJhFwlCYRcJQnP2Lrj55ryHeXh4OFnv6+vLuv0Ub62957777mtb27p1a9Zte1Lvb4h4Ouis30KSxwGcBXAZwJiZNcpoSkTKV8aR/R/NbKSE2xGRCsV7LiMSVG7YDcAfSL5Ncv1k30ByPckmyWar1cq8OxHpVG7YV5nZ1wA8BGAjya9f+w1mttnMGmbWqPKFJBFJywq7mX1UfBwGsBPAyjKaEpHydRx2krNJzr36OYBvAjhYVmMiUq6cV+MXA9hJ8urt/IeZvV5KVzeY4jHqmLemfOXKzp9QeXva575H4P77729bW7JkSdZte7w976Pp+G/SzI4C+PsSexGRCmn0JhKEwi4ShMIuEoTCLhKEwi4ShJa4dkHu6O3YsWPJ+vz587NuP8XbBjtnqeiZM2eS9YsXLybr3lbSGr19no7sIkEo7CJBKOwiQSjsIkEo7CJBKOwiQSjsIkFozt4FudsWV7mVdO57AHJcunQpWf/www+T9eXLl5fZzg1PR3aRIBR2kSAUdpEgFHaRIBR2kSAUdpEgFHaRIDRnL3hrn3PWRufO2T/77LNk/dy5cx3fdpXr1T3e6aBPnDiRrGvOfn10ZBcJQmEXCUJhFwlCYRcJQmEXCUJhFwlCYRcJQnP2greuu85132+++Way
fu+993Z829OnT+/4urneeeedZP3kyZNZt597uukbjXtkJ7mF5DDJgxMuW0jyDZIfFB8XVNumiOSaytP4bQAevOayJwDsNrO7AewuvhaRHuaG3czeAnD6motXA9hefL4dwJpy2xKRsnX6At1iMxsCgOLjHe2+keR6kk2SzVar1eHdiUiuyl+NN7PNZtYws0bOxogikqfTsJ8i2Q8Axcf09qciUrtOw74LwLri83UAXimnHRGpijuIJPkSgAcALCI5CGATgGcB/JbkIwBOAPh2lU1Gt2BBerI5b968LnVSrltuuSVZr3ItfURu2M1sbZvSN0ruRUQqpH86RYJQ2EWCUNhFglDYRYJQ2EWCYM4Wyder0WhYs9ns2v1N9PjjjyfrL7/8crKe2q7ZO/WwN0KaOXNmsu5tmXz69LVLF/7f0aNHk9fNXdrrbQedetfkrbfemnXfZ86cSdZTv9sLFy5MXtfbYttbfrtnz55kfcmSJcl6pxqNBprN5qQPnI7sIkEo7CJBKOwiQSjsIkEo7CJBKOwiQSjsIkGE2Wt3//79yfrAwECyPjY21rZ29uzZ5HVzt6H2Tsmcmid7M3rvdNCjo6PJurcV9axZs9rWLly4kHXb3lbRqZ/NW157++23J+up9zYAwNDQULJe1Zw9RUd2kSAUdpEgFHaRIBR2kSAUdpEgFHaRIBR2kSDCzNk906ZNS9bnzp3btubN2b09A7z17DlbKnvrzUdGRpJ1bw5/2223Jetz5sxpW/PeP+CtOffm7DmnbPYec+9xvXz5csf3XRUd2UWCUNhFglDYRYJQ2EWCUNhFglDYRYJQ2EWCCDNn9+amqfXqQHpm683RvT3IvfXuOb17e9qvWrUqWU+tRweAwcHBZP3jjz9uW/Pe2+CtpffWw8+fP79tzXtvw+zZs5P1Q4cOJeuffPJJsl4H98hOcgvJYZIHJ1z2NMmTJA8Ufx6utk0RyTWVp/HbADw4yeU/N7MVxZ/Xym1LRMrmht3M3gKQ3oNHRHpezgt0j5F8t3iav6DdN5FcT7JJstlqtTLuTkRydBr2XwL4CoAVAIYA/LTdN5rZZjNrmFkjdZI/EalWR2E3s1NmdtnMrgD4FYCV5bYlImXrKOwk+yd8+S0AB9t9r4j0BnfOTvIlAA8AWERyEMAmAA+QXAHAABwHsKG6FsvhrT/2zvWdmgl7+5t7M/zz588n6968OTWH//TTT5PX9fbT93r31m2n5tXe+wu89wh4s/LU3vDeueFz5Z4roApu2M1s7SQXv1hBLyJSIb1dViQIhV0kCIVdJAiFXSQIhV0kCC1xLXhbJuds5+yNp7zb9rZETi2h9UZA3s+du/w2NTb0lrh6S4O90Vvq+t6yZG/c6fF6r4OO7CJBKOwiQSjsIkEo7CJBKOwiQSjsIkEo7CJBhJmzp5Y7AnnzZm8ZqLcE1lvK6c3pU/Nmrzdv1u3Vvd5Sj5s3w/eWHc+bNy9ZT22D7c3Bvb8TT+6cvgo6sosEobCLBKGwiwShsIsEobCLBKGwiwShsIsEEWbO7q1f9taM5/Bmurnr3VMzXW+W7a0Jz5nxA+k5f86M3rttIO80254qf1+qoiO7SBAKu0gQCrtIEAq7SBAKu0gQCrtIEAq7SBB/fcPCDl28eDFZ9+au3kw3peo9xFPzau/nqnJfeI+3Vt6bZXv7BKR+dm/G7/Xm/T7kzvGr4B7ZSS4j+UeSh0keIvn94vKFJN8g+UHxcUH17YpIp6byNH4MwA/N7B4A/wBgI8nlAJ4AsNvM7gawu/haRHqUG3YzGzKz/cXnZwEcBrAUwGoA24tv2w5gTUU9ikgJrusFOpIDAL4KYA+AxWY2BIz/gwDgjjbXWU+ySbLZarUy2xWRTk057CTnAPgdgB+YWXonwAnMbLOZNcys0dfX10mPIlKCKYWd5HSMB/03Zvb74uJTJPuLej+A4WpaFJEyuKM3js9mXgRw2Mx+NqG0C8A6AM8WH1+ppMOSeKMSbxSTqnsj
onPnziXr3virSt5YMHd5bupnyx1/eWO/CxcutK3NmTMn67Y93liwDlOZs68C8D0A75E8UFz2JMZD/luSjwA4AeDblXQoIqVww25mfwLQ7p/nb5TbjohURW+XFQlCYRcJQmEXCUJhFwlCYRcJIswSV28W7s2Tc5a4erzlkN48OsVbopq7VDPnlM+5c/acvzPvvQ3eKbw9Xu910JFdJAiFXSQIhV0kCIVdJAiFXSQIhV0kCIVdJIgwc/ZZs2Yl6zlbSefO4L379ubJ3iw957a93rz7ztnm2pvDz5gxI1lPrWf3rpu7Hv2vcitpEbkxKOwiQSjsIkEo7CJBKOwiQSjsIkEo7CJBhJmzV7mPd+4pmXPn6Knr587JvXXfOT+7t+Y7dz/9VG9V/lyAv39CHXRkFwlCYRcJQmEXCUJhFwlCYRcJQmEXCUJhFwliKudnXwbg1wD+BsAVAJvN7BcknwbwLwBaxbc+aWavVdVorqVLlybre/fuTdZz5vQzZ85M1qvckz5nz3nAn0fnrKX35uw57y8A0rPu3MfFs3jx4kpvvxNTmfyPAfihme0nORfA2yTfKGo/N7OfVNeeiJRlKudnHwIwVHx+luRhAOnDpIj0nOt6DkZyAMBXAewpLnqM5Lskt5Bc0OY660k2STZbrdZk3yIiXTDlsJOcA+B3AH5gZmcA/BLAVwCswPiR/6eTXc/MNptZw8wafX19+R2LSEemFHaS0zEe9N+Y2e8BwMxOmdllM7sC4FcAVlbXpojkcsPO8ZdjXwRw2Mx+NuHy/gnf9i0AB8tvT0TKMpVX41cB+B6A90geKC57EsBakisAGIDjADZU0F9pFi1alKwfO3YsWU8tFfVei8hdqpmzLbE3nsodf+X+bFXedur63jj0zjvvzLpvb9Rbh6m8Gv8nAJM9aj07UxeRL9I76ESCUNhFglDYRYJQ2EWCUNhFglDYRYLovf1uK7Jx48Zk/f3330/W582b17bmzcG9eXHurDt1/6Ojo8nreks9vTm997NVOYf3pB5X75TN3uO2Zs2aZH3BgkmXitRKR3aRIBR2kSAUdpEgFHaRIBR2kSAUdpEgFHaRIJizVvq674xsAfjfCRctAjDStQauT6/21qt9AeqtU2X2dpeZTbr/W1fD/oU7J5tm1qitgYRe7a1X+wLUW6e61ZuexosEobCLBFF32DfXfP8pvdpbr/YFqLdOdaW3Wv/PLiLdU/eRXUS6RGEXCaKWsJN8kOT7JI+QfKKOHtoheZzkeyQPkGzW3MsWksMkD064bCHJN0h+UHysZeF0m96eJnmyeOwOkHy4pt6WkfwjycMkD5H8fnF5rY9doq+uPG5d/z87yWkA/gfAPwEYBLAPwFoz+++uNtIGyeMAGmZW+xswSH4dwDkAvzazvysu+zGA02b2bPEP5QIz+9ce6e1pAOfqPo13cbai/omnGQewBsA/o8bHLtHXd9CFx62OI/tKAEfM7KiZXQKwA8DqGvroeWb2FoDT11y8GsD24vPtGP9l6bo2vfUEMxsys/3F52cBXD3NeK2PXaKvrqgj7EsB/HnC14PorfO9G4A/kHyb5Pq6m5nEYjMbAsZ/eQDcUXM/13JP491N15xmvGceu05Of56rjrBPtilZL83/VpnZ1wA8BGBj8XRVpmZKp/HulklOM94TOj39ea46wj4IYNmEr78E4KMa+piUmX1UfBwGsBO9dyrqU1fPoFt8HK65n//TS6fxnuw04+iBx67O05/XEfZ9AO4m+WWSMwB8F8CuGvr4ApKzixdOQHI2gG+i905FvQvAuuLzdQBeqbGXz+mV03i3O804an7saj/9uZl1/Q+AhzH+ivyHAH5URw9t+vpbAP9V/DlUd28AXsL407pRjD8jegTA7QB2A/ig+Liwh3r7dwDvAXgX48Hqr6m3+zD+X8N3ARwo/jxc92OX6Ksrj5veLisShN5BJxKEwi4ShMIuEoTCLhKEwi4ShMIuEoTCLhLEXwBNE6OxcGf3
fQAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "LABEL: 8\n", "INPUT:\n" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAPsAAAD4CAYAAAAq5pAIAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAQU0lEQVR4nO3dX4he9Z3H8c/XJEaTTE00owwxmrYEXFldK4MsuNRIUdQb04uG5qK4GEgFhRYqrHaReuGFLGvLCosQV2lcu5aCihFktyKV0JviKGqi0Y3G2EaHmYkmJjGJJpnvXsxxmcY5v+94zvMPv+8XDDPzfOc85zfneT7zzMz3/M7P3F0Avv7O6PcAAPQGYQeSIOxAEoQdSIKwA0ks7OXOVq5c6WvWrOnlLoFU9u7dq/3799tctVZhN7MbJP2bpAWS/sPd7y99/Zo1azQ2NtZmlwAKRkdHa2uNf403swWS/l3SjZIulbTRzC5ten8AuqvN3+xXSXrH3fe4++eSfivp5s4MC0CntQn7Kkl/mfX5vuq2v2Jmm81szMzGpqamWuwOQBttwj7XPwG+dO6tu29x91F3Hx0eHm6xOwBttAn7PkmrZ31+oaQP2w0HQLe0CftLktaa2TfN7ExJP5S0rTPDAtBpjVtv7n7SzO6Q9D+aab096u5vdGxkADqqVZ/d3Z+T9FyHxgKgizhdFkiCsANJEHYgCcIOJEHYgSQIO5AEYQeSIOxAEoQdSIKwA0kQdiAJwg4kQdiBJAg7kARhB5Ig7EAShB1IgrADSRB2IAnCDiRB2IEkCDuQBGEHkiDsQBKEHUiCsANJEHYgCcIOJEHYgSQIO5BEqyWbzWyvpMOSTkk66e6jnRgUgM5rFfbKte6+vwP3A6CL+DUeSKJt2F3S783sZTPbPNcXmNlmMxszs7GpqamWuwPQVNuwX+3uV0q6UdLtZvbd07/A3be4+6i7jw4PD7fcHYCmWoXd3T+s3k9KelrSVZ0YFIDOaxx2M1tqZkNffCzpekk7OzUwAJ3V5r/xF0h62sy+uJ//cvf/7sioGjh69Gix/tRTTxXrk5OTxfqqVatqa+5e3PbkyZPF+hlnlH/mRvXS/Uf7jlSPb+N6NPaSxYsXF+snTpwo1o8fP15bO3jwYHHbI0eOFOtDQ0PF+oYNG4r10vMpMj093Wi7xmF39z2S/q7p9gB6i9YbkARhB5Ig7EAShB1IgrADSXRiIsxAuPXWW4v1t99+u1iP2mdnn312ba3U4pHatZ+kuNVy7Nix2lo0tiVLlhTr0dgXLFhQrE9MTNTWorZd6ZhL8XEpPaYjIyPFbaO2X+n7kqQ333yzWH/44YeL9ZKmzyde2YEkCDuQBGEHkiDsQBKEHUiCsANJEHYgiZ722Y8dO6bXXnuttr5u3bri9vfcc09t7fPPPy9ue/755xfrUb/4rLPOalST4um30dgjp06darzvqB5ZuLD8FLroootqa1EvO+qzR49ZqY8fPWaRM888s1h/9913i/XHH3+8tnbfffcVt922bVttrXReBa/sQBKEHUiCsANJEHYgCcIOJEHYgSQIO5CERfO4O2nZsmV++eWX19Y/++yz4valXvmKFSuK25bmfEvlXrVU7ulG27adzx71k0uisUU9/ui4RX320v6jsUV9+KhXvmjRosb7js4/iM7beP/994v1/fvr10IdHx8vbrtx48ba2pNPPqnJyck5TzDglR1IgrADSRB2IAnCDiRB2IEkCDuQBGEHkujpfPbp6eniUrgXX3xxcftDhw7V1qJrkEd91eh8g1I9uu+oH9xmXrZU
7nVHc8Ijn3zySbEejb10bKJr2kfLTUf7Ln3vbZfR/vTTT4v1AwcOFOulcwii58uzzz5bWystRR2+spvZo2Y2aWY7Z912rpk9b2a7q/flM1oA9N18fo3/taQbTrvtLkkvuPtaSS9UnwMYYGHY3X27pI9Pu/lmSVurj7dKWt/ZYQHotKZ/s1/g7uOS5O7jZlZ7orCZbZa0WSqfqwygu7r+33h33+Luo+4+Gk2aANA9TcM+YWYjklS9n+zckAB0Q9Owb5N0S/XxLZKe6cxwAHRL+Hu1mT0haZ2klWa2T9IvJN0v6XdmtknSnyX9YD47W716tR588MHa+t13313cvtQ3jXqu0dzoqFce9YRLomuMRz3dNtcciP5PEs1XbzOXXirPl4/GFtXbzEmPjmn0mJT62ZJ0+PDhxve/fPny4rYPPfRQbe22226rrYVhd/e6mfLfi7YFMDg4XRZIgrADSRB2IAnCDiRB2IEkenpK29DQUHFZ5ksuuaS4fbQMbknUSonaPKWz/6I2TtQimp6eLtZPnDhRrJemwEZnLUaX747GFh3XUusuaklGUz2jdmiprdj2MtbR9tH3tm/fvtranXfeWdz22muvra0NDQ3V1nhlB5Ig7EAShB1IgrADSRB2IAnCDiRB2IEkBurSMVFPt9TPji4N3PYqOUuWLKmtRcseRz3ZSHQp6dJxiS55HJ0jEPWyo/MTStOSo+m1bY9r6XuLpu5Gz6fouVpaklmSRkZGamsbNmwobtsUr+xAEoQdSIKwA0kQdiAJwg4kQdiBJAg7kMRA9dkfeOCBYv26666rrUWX3436ptH2pTnlUb+42yvhlL63aL551KtuM189Es1Xjy7H3Ob8hej8gdLS4pK0cuXKYj16zNeuXVtbu+yyy4rbNsUrO5AEYQeSIOxAEoQdSIKwA0kQdiAJwg4kMVB99uuvv75YL/U2o7nPpXnVknTOOecU63v27KmtRb3maD56JLr/0rztaN512z57pDS26NrqpWugS/GyyaVeeXQt/ui4vffee8X66tWri/Xt27fX1jZt2lTc9pFHHinW64SPpJk9amaTZrZz1m33mtkHZvZq9XZTo70D6Jn5/Nj+taQb5rj9V+5+RfX2XGeHBaDTwrC7+3ZJH/dgLAC6qM0fZHeY2evVr/kr6r7IzDab2ZiZjU1NTbXYHYA2mob9IUnflnSFpHFJtTNY3H2Lu4+6++jw8HDD3QFoq1HY3X3C3U+5+7SkhyVd1dlhAei0RmE3s9nXwf2+pJ11XwtgMIR9djN7QtI6SSvNbJ+kX0haZ2ZXSHJJeyX9eD47O3bsmHbs2FFbX7p0aXH78847r7Z26NCh4rYXXnhhsd5mrfCoj952jfNIm3ndba9pH63vXrr/aM53dE37SOlxiebSR8+H6NyH6DEvndexe/fu4rZNhWF3941z3Nysqw+gbzhdFkiCsANJEHYgCcIOJEHYgSR6OsXV3YtTC6Nphx988EFtrbSksiRNTEyUBxcoXa45ahFF7a2ozdNmKeto39Fxi7RpK0Ytx+i4Ru2vUj26/HdUP3DgQLEeHffSlOuojVy6xHbp8eCVHUiCsANJEHYgCcIOJEHYgSQIO5AEYQeS6GmffcmSJbryyiuL9ZJSHz7qe+7fv79YX7ZsWbFemn4bTZeMlouOLoMdKfV0o+m34+PjxXp07kOb5aij+47OP2jTp287LTm6NHl0/6WpwYsXLy5uW7rEdvG8huK9AvjaIOxAEoQdSIKwA0kQdiAJwg4kQdiBJHraZz958qQ++uij2nrU6y7NT44uaRzNT47mHx8/frxr992mV93WN77xjb7tO9L2EtulXnnp8ZSko0ePFutRL7w051wqP2ei8wua4pUdSIKwA0kQdiAJwg4kQdiBJAg7kARhB5LoaYN34cKFxWWXI2+99VZtLeqLHjlypFiP5i+X5qRH9x31TaOebZt+czSvetGiRcV6dG32SOkcg+i+o+sARPPh2+w7Om7RNe2j
+e4HDx6srUXfV1Phs8jMVpvZH8xsl5m9YWY/qW4/18yeN7Pd1fsVXRkhgI6Yz0vGSUk/c/e/kfT3km43s0sl3SXpBXdfK+mF6nMAAyoMu7uPu/sr1ceHJe2StErSzZK2Vl+2VdL6Lo0RQAd8pT8GzWyNpO9I+pOkC9x9XJr5gSDp/JptNpvZmJmNTU1NtRwugKbmHXYzWybpSUk/dffyynOzuPsWdx9199Hh4eEmYwTQAfMKu5kt0kzQf+PuT1U3T5jZSFUfkTTZnSEC6ISw9WYzPYhHJO1y91/OKm2TdIuk+6v3z7QdzGOPPVasv/jii43vO2pvlS4VHYmm17Zt80TaXDI5aiG1XVa5TfsrmvobtahKSx9H20bfdzS2aOpw6RLe11xzTXHbpubTZ79a0o8k7TCzV6vbfq6ZkP/OzDZJ+rOkH3RlhAA6Igy7u/9RUt3Lw/c6OxwA3cLpskAShB1IgrADSRB2IAnCDiTRv2sYz2H58uXF+vr163syDuDriFd2IAnCDiRB2IEkCDuQBGEHkiDsQBKEHUiCsANJEHYgCcIOJEHYgSQIO5AEYQeSIOxAEoQdSIKwA0kQdiAJwg4kQdiBJAg7kARhB5Ig7EAShB1IIgy7ma02sz+Y2S4ze8PMflLdfq+ZfWBmr1ZvN3V/uACams8iEScl/czdXzGzIUkvm9nzVe1X7v6v3RsegE6Zz/rs45LGq48Pm9kuSau6PTAAnfWV/mY3szWSviPpT9VNd5jZ62b2qJmtqNlms5mNmdnY1NRUu9ECaGzeYTezZZKelPRTdz8k6SFJ35Z0hWZe+R+Yazt33+Luo+4+Ojw83H7EABqZV9jNbJFmgv4bd39Kktx9wt1Pufu0pIclXdW9YQJoaz7/jTdJj0ja5e6/nHX7yKwv+76knZ0fHoBOmc9/46+W9CNJO8zs1eq2n0vaaGZXSHJJeyX9uAvjA9Ah8/lv/B8l2Ryl5zo/HADdwhl0QBKEHUiCsANJEHYgCcIOJEHYgSQIO5AEYQeSIOxAEoQdSIKwA0kQdiAJwg4kQdiBJMzde7czsylJ78+6aaWk/T0bwFczqGMb1HFJjK2pTo7tYnef8/pvPQ37l3ZuNubuo30bQMGgjm1QxyUxtqZ6NTZ+jQeSIOxAEv0O+5Y+779kUMc2qOOSGFtTPRlbX/9mB9A7/X5lB9AjhB1Ioi9hN7MbzOxtM3vHzO7qxxjqmNleM9tRLUM91uexPGpmk2a2c9Zt55rZ82a2u3o/5xp7fRrbQCzjXVhmvK/Hrt/Ln/f8b3YzWyDpfyVdJ2mfpJckbXT3N3s6kBpmtlfSqLv3/QQMM/uupCOSHnP3v61u+xdJH7v7/dUPyhXu/k8DMrZ7JR3p9zLe1WpFI7OXGZe0XtI/qo/HrjCuDerBcevHK/tVkt5x9z3u/rmk30q6uQ/jGHjuvl3Sx6fdfLOkrdXHWzXzZOm5mrENBHcfd/dXqo8PS/pimfG+HrvCuHqiH2FfJekvsz7fp8Fa790l/d7MXjazzf0ezBwucPdxaebJI+n8Po/ndOEy3r102jLjA3Psmix/3lY/wj7XUlKD1P+72t2vlHSjpNurX1cxP/NaxrtX5lhmfCA0Xf68rX6EfZ+k1bM+v1DSh30Yx5zc/cPq/aSkpzV4S1FPfLGCbvV+ss/j+X+DtIz3XMuMawCOXT+XP+9H2F+StNbMvmlmZ0r6oaRtfRjHl5jZ0uofJzKzpZKu1+AtRb1N0i3Vx7dIeqaPY/krg7KMd90y4+rzsev78ufu3vM3STdp5j/y70r6536MoWZc35L0WvX2Rr/HJukJzfxad0IzvxFtknSepBck7a7enztAY/tPSTskva6ZYI30aWz/oJk/DV+X9Gr1dlO/j11hXD05bpwuCyTBGXRAEoQdSIKwA0kQdiAJwg4kQdiBJAg7kMT/ATUivi8jflfhAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "LABEL: 8\n" ] } ], "source": [ "#let's try the plotting function\n", "plot_input(X_train,y_train,10)\n", "plot_input(X_test,y_test,50)\n", "plot_input(X_test,y_test,300)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 1\n", "\n", "Now use a (feed-forward) Neural Network for prediction. Use the multi-layer perceptron (MLP) classifier MLPClassifier(...) in scikit-learn, with the following parameters: max_iter=300, alpha=1e-4, solver='sgd', tol=1e-4, learning_rate_init=.1, random_state=ID (this last parameter ensures the run is the same even if you run it more than once). The alpha parameter is the regularization parameter for L2 regularization that is used by the MLP in sklearn.\n", "\n", "Then, using the default activation function, pick four or five architectures to consider, with different numbers of hidden layers and different sizes. It is not necessary to create huge neural networks, you can limit to 3 layers and, for each layer, its maximum size can be of 100. You can evaluate the architectures you chose using the GridSearchCV with a 5-fold cross-validation, and use the results to pick the best architecture. 
The code below provides some architectures you can use, but you can choose other ones if you prefer.\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "RESULTS FOR NN\n", "\n", "Best parameters set found:\n", "{'hidden_layer_sizes': (50,)}\n", "Score with best parameters:\n", "0.792\n", "\n", "All scores on the grid:\n", "{'mean_fit_time': array([0.67497864, 0.96655545, 0.26948581, 0.59302826]), 'std_fit_time': array([0.16834269, 0.03805584, 0.03209772, 0.22659953]), 'mean_score_time': array([0.00099649, 0.00119662, 0.00079808, 0.00099711]), 'std_score_time': array([7.62939453e-07, 3.97920966e-04, 3.99042696e-04, 6.64157308e-07]), 'param_hidden_layer_sizes': masked_array(data=[(10,), (50,), (10, 10), (50, 50)],\n", " mask=[False, False, False, False],\n", " fill_value='?',\n", " dtype=object), 'params': [{'hidden_layer_sizes': (10,)}, {'hidden_layer_sizes': (50,)}, {'hidden_layer_sizes': (10, 10)}, {'hidden_layer_sizes': (50, 50)}], 'split0_test_score': array([0.38, 0.8 , 0.63, 0.75]), 'split1_test_score': array([0.71, 0.81, 0.55, 0.79]), 'split2_test_score': array([0.74, 0.79, 0.78, 0.77]), 'split3_test_score': array([0.72, 0.77, 0.6 , 0.79]), 'split4_test_score': array([0.7 , 0.79, 0.1 , 0.78]), 'mean_test_score': array([0.65 , 0.792, 0.532, 0.776]), 'std_test_score': array([0.1356466 , 0.0132665 , 0.22920733, 0.01496663]), 'rank_test_score': array([3, 1, 4, 2])}\n" ] } ], "source": [ "#MLPclassifier requires in input the parameter hidden_layer_sizes, that is a tuple specifying the number of \n", "#neurons in the hidden layers; for example: (10,) means that there is only 1 hidden layer with 10 neurons; \n", "#(10,50) means that there are 2 hidden layers, the first with 10 neurons, the second with 50 neurons\n", "\n", "#these are examples of possible architectures you can test, but feel free to use different architectures! 
\n", "hl_parameters = {'hidden_layer_sizes': [(10,), (50,), (10,10,), (50,50,)]}\n", "\n", "mlp_cv = MLPClassifier(max_iter=300, alpha=1e-4, solver='sgd', tol=1e-4, learning_rate_init=.1, random_state=ID)\n", "gridSearch = GridSearchCV(mlp_cv, hl_parameters, cv=5)\n", "gridSearch.fit(X_train,y_train)\n", " \n", "print ('RESULTS FOR NN\\n')\n", "\n", "print(\"Best parameters set found:\")\n", "print(gridSearch.best_params_)\n", "\n", "print(\"Score with best parameters:\")\n", "print(gridSearch.best_score_)\n", "\n", "print(\"\\nAll scores on the grid:\")\n", "print(gridSearch.cv_results_)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 2\n", "\n", "What do you observe for different architectures and their scores? How do the number of layers and their sizes affect the performance?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Mean 5-fold cross-validation accuracy for each architecture:\n", "\n", "| Hidden layers | Neurons per layer | Mean CV accuracy | Std across folds |\n", "| --- | --- | --- | --- |\n", "| 1 | 10 | 0.650 | 0.136 |\n", "| 1 | 50 | 0.792 | 0.013 |\n", "| 2 | 10 | 0.532 | 0.229 |\n", "| 2 | 50 | 0.776 | 0.015 |\n", "\n", "The best result is obtained by increasing the number of neurons per layer: going from 10 to 50 neurons raises the score from 0.65 to 0.792 with one hidden layer and from 0.532 to 0.776 with two. Adding a second hidden layer does not help: with 50 neurons the score is slightly lower (0.776 against 0.792), and with 10 neurons it drops to 0.532. The 10-neuron networks are also much less stable across folds (standard deviation 0.14 and 0.23, against about 0.01 for the 50-neuron networks); in one fold the (10, 10) network scores only 0.10, which is chance level for 10 classes." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 3\n", "\n", "Now get the training and test errors (according to the initial split) for a NN with the best parameters chosen from the cross-validation above (and learning the NN weights from the entire training set). Use verbose=True\n", "as input to see how the loss changes over the iterations. 
(Note that the loss reported by MLPClassifier is the log-loss (cross-entropy), not the 0-1 loss; the average 0-1 loss is the misclassification error, i.e. 1 minus the *accuracy*.)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Iteration 1, loss = 2.23491903\n", "Iteration 2, loss = 1.65312581\n", "Iteration 3, loss = 1.39238340\n", "Iteration 4, loss = 0.94868744\n", "Iteration 5, loss = 0.95214853\n", "Iteration 6, loss = 0.82783418\n", "Iteration 7, loss = 0.68006396\n", "Iteration 8, loss = 0.60126940\n", "Iteration 9, loss = 0.55428932\n", "Iteration 10, loss = 0.50118175\n", "Iteration 11, loss = 0.48019322\n", "Iteration 12, loss = 0.42746870\n", "Iteration 13, loss = 0.40621904\n", "Iteration 14, loss = 0.32164951\n", "Iteration 15, loss = 0.32727930\n", "Iteration 16, loss = 0.38149245\n", "Iteration 17, loss = 0.26987695\n", "Iteration 18, loss = 0.27374162\n", "Iteration 19, loss = 0.23157931\n", "Iteration 20, loss = 0.31955497\n", "Iteration 21, loss = 0.20953823\n", "Iteration 22, loss = 0.19146667\n", "Iteration 23, loss = 0.17265256\n", "Iteration 24, loss = 0.18594496\n", "Iteration 25, loss = 0.27227754\n", "Iteration 26, loss = 0.21490140\n", "Iteration 27, loss = 0.16040825\n", "Iteration 28, loss = 0.14964326\n", "Iteration 29, loss = 0.17283541\n", "Iteration 30, loss = 0.20290663\n", "Iteration 31, loss = 0.17095906\n", "Iteration 32, loss = 0.09971637\n", "Iteration 33, loss = 0.08826478\n", "Iteration 34, loss = 0.08538536\n", "Iteration 35, loss = 0.07678185\n", "Iteration 36, loss = 0.08431189\n", "Iteration 37, loss = 0.07979650\n", "Iteration 38, loss = 0.16448143\n", "Iteration 39, loss = 0.14163033\n", "Iteration 40, loss = 0.06614859\n", "Iteration 41, loss = 0.05559328\n", "Iteration 42, loss = 0.05971223\n", "Iteration 43, loss = 0.05334380\n", "Iteration 44, loss = 0.04477627\n", "Iteration 45, loss = 0.04780368\n", "Iteration 46, loss = 0.03780384\n", "Iteration 47, loss = 0.03618613\n", "Iteration 48, loss = 
0.03317051\n", "Iteration 49, loss = 0.03438012\n", "Iteration 50, loss = 0.03317725\n", "Iteration 51, loss = 0.03470568\n", "Iteration 52, loss = 0.02813227\n", "Iteration 53, loss = 0.02649521\n", "Iteration 54, loss = 0.02694411\n", "Iteration 55, loss = 0.02455538\n", "Iteration 56, loss = 0.02638552\n", "Iteration 57, loss = 0.03315480\n", "Iteration 58, loss = 0.02160265\n", "Iteration 59, loss = 0.02012935\n", "Iteration 60, loss = 0.02028987\n", "Iteration 61, loss = 0.01864363\n", "Iteration 62, loss = 0.01998649\n", "Iteration 63, loss = 0.01780610\n", "Iteration 64, loss = 0.01812111\n", "Iteration 65, loss = 0.01588105\n", "Iteration 66, loss = 0.01517374\n", "Iteration 67, loss = 0.01502516\n", "Iteration 68, loss = 0.01454002\n", "Iteration 69, loss = 0.01388756\n", "Iteration 70, loss = 0.01382663\n", "Iteration 71, loss = 0.01312566\n", "Iteration 72, loss = 0.01306304\n", "Iteration 73, loss = 0.01221908\n", "Iteration 74, loss = 0.01241014\n", "Iteration 75, loss = 0.01169710\n", "Iteration 76, loss = 0.01174548\n", "Iteration 77, loss = 0.01102452\n", "Iteration 78, loss = 0.01096250\n", "Iteration 79, loss = 0.01042870\n", "Iteration 80, loss = 0.01021668\n", "Iteration 81, loss = 0.00997063\n", "Iteration 82, loss = 0.00981628\n", "Iteration 83, loss = 0.00968432\n", "Iteration 84, loss = 0.00935086\n", "Iteration 85, loss = 0.00931229\n", "Iteration 86, loss = 0.00915253\n", "Iteration 87, loss = 0.00879651\n", "Iteration 88, loss = 0.00870157\n", "Iteration 89, loss = 0.00847841\n", "Iteration 90, loss = 0.00833403\n", "Iteration 91, loss = 0.00817054\n", "Iteration 92, loss = 0.00798444\n", "Iteration 93, loss = 0.00795310\n", "Iteration 94, loss = 0.00784930\n", "Iteration 95, loss = 0.00756874\n", "Iteration 96, loss = 0.00754739\n", "Iteration 97, loss = 0.00746602\n", "Iteration 98, loss = 0.00714852\n", "Iteration 99, loss = 0.00711265\n", "Iteration 100, loss = 0.00695263\n", "Iteration 101, loss = 0.00687172\n", "Iteration 102, loss 
= 0.00669678\n", "Iteration 103, loss = 0.00674397\n", "Iteration 104, loss = 0.00653825\n", "Iteration 105, loss = 0.00648908\n", "Iteration 106, loss = 0.00640641\n", "Iteration 107, loss = 0.00639659\n", "Iteration 108, loss = 0.00617655\n", "Iteration 109, loss = 0.00616551\n", "Iteration 110, loss = 0.00610756\n", "Iteration 111, loss = 0.00612010\n", "Iteration 112, loss = 0.00598451\n", "Iteration 113, loss = 0.00575291\n", "Iteration 114, loss = 0.00569262\n", "Iteration 115, loss = 0.00561780\n", "Iteration 116, loss = 0.00551546\n", "Iteration 117, loss = 0.00552306\n", "Iteration 118, loss = 0.00536921\n", "Iteration 119, loss = 0.00532779\n", "Iteration 120, loss = 0.00523108\n", "Iteration 121, loss = 0.00513925\n", "Iteration 122, loss = 0.00514707\n", "Iteration 123, loss = 0.00508249\n", "Iteration 124, loss = 0.00496849\n", "Iteration 125, loss = 0.00493705\n", "Iteration 126, loss = 0.00482176\n", "Iteration 127, loss = 0.00485244\n", "Iteration 128, loss = 0.00474512\n", "Iteration 129, loss = 0.00471414\n", "Iteration 130, loss = 0.00466621\n", "Iteration 131, loss = 0.00455180\n", "Iteration 132, loss = 0.00457542\n", "Iteration 133, loss = 0.00446377\n", "Iteration 134, loss = 0.00443313\n", "Iteration 135, loss = 0.00434429\n", "Iteration 136, loss = 0.00431274\n", "Iteration 137, loss = 0.00426423\n", "Iteration 138, loss = 0.00424446\n", "Iteration 139, loss = 0.00417999\n", "Iteration 140, loss = 0.00421593\n", "Iteration 141, loss = 0.00417801\n", "Iteration 142, loss = 0.00411158\n", "Training loss did not improve more than tol=0.000100 for 10 consecutive epochs. 
Stopping.\n", "\n", "RESULTS FOR BEST NN\n", "\n", "Best NN training error: 0.000000\n", "Best NN test error: 0.212403\n" ] } ], "source": [ "# Get the training and test error for the best NN model selected by CV\n", "\n", "mlp = MLPClassifier(max_iter=300, alpha=1e-4, solver='sgd', tol=1e-4, learning_rate_init=.1, random_state=ID, hidden_layer_sizes=(25,), verbose=True)\n", "mlp.fit(X_train, y_train)\n", "\n", "training_error = 1. - mlp.score(X_train, y_train)\n", "\n", "test_error = 1. - mlp.score(X_test, y_test)\n", "\n", "print('\\nRESULTS FOR BEST NN\\n')\n", "\n", "print(\"Best NN training error: %f\" % training_error)\n", "print(\"Best NN test error: %f\" % test_error)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## More data \n", "Now let's do the same, but using 10000 data points for training (or fewer if that takes too long on your machine). Use the same NN architectures as before; feel free to try more if you want!" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Labels and frequencies in training dataset: \n" ] }, { "data": { "text/plain": [ "(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8),\n", " array([1034, 994, 997, 1009, 993, 995, 993, 1021, 993, 971],\n", " dtype=int64))" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = X[permutation]\n", "y = y[permutation]\n", "\n", "m_training = 10000\n", "\n", "X_train, X_test = X[:m_training], X[m_training:]\n", "y_train, y_test = y[:m_training], y[m_training:]\n", "\n", "print(\"Labels and frequencies in training dataset: \")\n", "np.unique(y_train, return_counts=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 4\n", "\n", "Now train the NNs with the added data points. Feel free to try more architectures than before if you want, or fewer if training takes too long. 
You can set 'verbose=True' to get an idea of how long one iteration takes (if needed, also reduce the number of iterations to 50)." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Fitting 5 folds for each of 4 candidates, totalling 20 fits\n", "Iteration 1, loss = 1.23587398\n", "Iteration 2, loss = 0.65936105\n", "Iteration 3, loss = 0.54151440\n", "Iteration 4, loss = 0.49385153\n", "Iteration 5, loss = 0.48494783\n", "Iteration 6, loss = 0.45150179\n", "Iteration 7, loss = 0.43163933\n", "Iteration 8, loss = 0.41310924\n", "Iteration 9, loss = 0.41431347\n", "Iteration 10, loss = 0.39808539\n", "Iteration 11, loss = 0.39731951\n", "Iteration 12, loss = 0.38004774\n", "Iteration 13, loss = 0.36683556\n", "Iteration 14, loss = 0.37763441\n", "Iteration 15, loss = 0.35815620\n", "Iteration 16, loss = 0.35895180\n", "Iteration 17, loss = 0.35689411\n", "Iteration 18, loss = 0.34525345\n", "Iteration 19, loss = 0.35070440\n", "Iteration 20, loss = 0.34649555\n", "Iteration 21, loss = 0.33038446\n", "Iteration 22, loss = 0.32675327\n", "Iteration 23, loss = 0.32080759\n", "Iteration 24, loss = 0.31963739\n", "Iteration 25, loss = 0.33073655\n", "Iteration 26, loss = 0.31867702\n", "Iteration 27, loss = 0.31356816\n", "Iteration 28, loss = 0.31014858\n", "Iteration 29, loss = 0.30637664\n", "Iteration 30, loss = 0.31438370\n", "Iteration 31, loss = 0.29798886\n", "Iteration 32, loss = 0.31016254\n", "Iteration 33, loss = 0.29558043\n", "Iteration 34, loss = 0.28373417\n", "Iteration 35, loss = 0.28224505\n", "Iteration 36, loss = 0.28231645\n", "Iteration 37, loss = 0.28412738\n", "Iteration 38, loss = 0.28435457\n", "Iteration 39, loss = 0.28335815\n", "Iteration 40, loss = 0.28409554\n", "Iteration 41, loss = 0.28290619\n", "Iteration 42, loss = 0.27143385\n", "Iteration 43, loss = 0.28191156\n", "Iteration 44, loss = 0.27640979\n", "Iteration 45, loss = 0.27513747\n", 
"Iteration 46, loss = 0.27475180\n", "Iteration 47, loss = 0.27790929\n", "Iteration 48, loss = 0.26216122\n", "Iteration 49, loss = 0.26681233\n", "Iteration 50, loss = 0.25372820\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 1, loss = 1.25726189\n", "Iteration 2, loss = 0.66855680\n", "Iteration 3, loss = 0.53680780\n", "Iteration 4, loss = 0.49045834\n", "Iteration 5, loss = 0.48981272\n", "Iteration 6, loss = 0.45749934\n", "Iteration 7, loss = 0.44039103\n", "Iteration 8, loss = 0.42130033\n", "Iteration 9, loss = 0.42144670\n", "Iteration 10, loss = 0.39993472\n", "Iteration 11, loss = 0.40494530\n", "Iteration 12, loss = 0.37465901\n", "Iteration 13, loss = 0.38146512\n", "Iteration 14, loss = 0.37827126\n", "Iteration 15, loss = 0.36655644\n", "Iteration 16, loss = 0.36725868\n", "Iteration 17, loss = 0.36140347\n", "Iteration 18, loss = 0.35081333\n", "Iteration 19, loss = 0.35022994\n", "Iteration 20, loss = 0.34213559\n", "Iteration 21, loss = 0.34614536\n", "Iteration 22, loss = 0.34979740\n", "Iteration 23, loss = 0.32829712\n", "Iteration 24, loss = 0.32939645\n", "Iteration 25, loss = 0.33373366\n", "Iteration 26, loss = 0.32015835\n", "Iteration 27, loss = 0.32405950\n", "Iteration 28, loss = 0.31224441\n", "Iteration 29, loss = 0.30837069\n", "Iteration 30, loss = 0.31267110\n", "Iteration 31, loss = 0.31259982\n", "Iteration 32, loss = 0.30787576\n", "Iteration 33, loss = 0.30532307\n", "Iteration 34, loss = 0.28999011\n", "Iteration 35, loss = 0.30014887\n", "Iteration 36, loss = 0.29316484\n", "Iteration 37, loss = 0.28590764\n", "Iteration 38, loss = 0.29521087\n", "Iteration 39, loss = 
0.29619921\n", "Iteration 40, loss = 0.29684911\n", "Iteration 41, loss = 0.28760052\n", "Iteration 42, loss = 0.28404509\n", "Iteration 43, loss = 0.27958895\n", "Iteration 44, loss = 0.27581128\n", "Iteration 45, loss = 0.27074622\n", "Iteration 46, loss = 0.28412877\n", "Iteration 47, loss = 0.29037676\n", "Iteration 48, loss = 0.28556805\n", "Iteration 49, loss = 0.26591713\n", "Iteration 50, loss = 0.26836327\n", "Iteration 1, loss = 1.25882218\n", "Iteration 2, loss = 0.66430507\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 3, loss = 0.56480652\n", "Iteration 4, loss = 0.50085580\n", "Iteration 5, loss = 0.49854607\n", "Iteration 6, loss = 0.46649683\n", "Iteration 7, loss = 0.44780806\n", "Iteration 8, loss = 0.41994749\n", "Iteration 9, loss = 0.41006418\n", "Iteration 10, loss = 0.40460474\n", "Iteration 11, loss = 0.39546968\n", "Iteration 12, loss = 0.37240638\n", "Iteration 13, loss = 0.37805694\n", "Iteration 14, loss = 0.37384895\n", "Iteration 15, loss = 0.36150245\n", "Iteration 16, loss = 0.34677600\n", "Iteration 17, loss = 0.35083096\n", "Iteration 18, loss = 0.34951451\n", "Iteration 19, loss = 0.34287402\n", "Iteration 20, loss = 0.33651471\n", "Iteration 21, loss = 0.33988896\n", "Iteration 22, loss = 0.33439686\n", "Iteration 23, loss = 0.32448438\n", "Iteration 24, loss = 0.31781136\n", "Iteration 25, loss = 0.31203681\n", "Iteration 26, loss = 0.31249348\n", "Iteration 27, loss = 0.31793986\n", "Iteration 28, loss = 0.30961641\n", "Iteration 29, loss = 0.30364679\n", "Iteration 30, loss = 0.32068186\n", "Iteration 31, loss = 0.29978208\n", "Iteration 32, loss = 0.30462007\n", "Iteration 33, loss 
= 0.29399890\n", "Iteration 34, loss = 0.29085137\n", "Iteration 35, loss = 0.28603925\n", "Iteration 36, loss = 0.28255823\n", "Iteration 37, loss = 0.28215683\n", "Iteration 38, loss = 0.28261378\n", "Iteration 39, loss = 0.28308682\n", "Iteration 40, loss = 0.28936598\n", "Iteration 41, loss = 0.27760208\n", "Iteration 42, loss = 0.26315881\n", "Iteration 43, loss = 0.27324739\n", "Iteration 44, loss = 0.26930160\n", "Iteration 45, loss = 0.26716015\n", "Iteration 46, loss = 0.27418343\n", "Iteration 47, loss = 0.25384500\n", "Iteration 48, loss = 0.27040876\n", "Iteration 49, loss = 0.27482216\n", "Iteration 50, loss = 0.26416137\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 1, loss = 1.24567495\n", "Iteration 2, loss = 0.63442133\n", "Iteration 3, loss = 0.53288675\n", "Iteration 4, loss = 0.47039829\n", "Iteration 5, loss = 0.45439512\n", "Iteration 6, loss = 0.44070402\n", "Iteration 7, loss = 0.43161351\n", "Iteration 8, loss = 0.40379499\n", "Iteration 9, loss = 0.40560964\n", "Iteration 10, loss = 0.39292444\n", "Iteration 11, loss = 0.38335170\n", "Iteration 12, loss = 0.37458042\n", "Iteration 13, loss = 0.36426545\n", "Iteration 14, loss = 0.36204614\n", "Iteration 15, loss = 0.36573608\n", "Iteration 16, loss = 0.35025439\n", "Iteration 17, loss = 0.33526211\n", "Iteration 18, loss = 0.34980969\n", "Iteration 19, loss = 0.33844228\n", "Iteration 20, loss = 0.32909306\n", "Iteration 21, loss = 0.33285837\n", "Iteration 22, loss = 0.32269523\n", "Iteration 23, loss = 0.33350310\n", "Iteration 24, loss = 0.31526877\n", "Iteration 25, loss = 0.31531497\n", "Iteration 26, loss = 0.31068174\n", "Iteration 27, 
loss = 0.32296418\n", "Iteration 28, loss = 0.30516691\n", "Iteration 29, loss = 0.29803251\n", "Iteration 30, loss = 0.30803810\n", "Iteration 31, loss = 0.30439571\n", "Iteration 32, loss = 0.30146680\n", "Iteration 33, loss = 0.30078061\n", "Iteration 34, loss = 0.29407722\n", "Iteration 35, loss = 0.28237979\n", "Iteration 36, loss = 0.27150145\n", "Iteration 37, loss = 0.28371185\n", "Iteration 38, loss = 0.27395914\n", "Iteration 39, loss = 0.28011491\n", "Iteration 40, loss = 0.28045460\n", "Iteration 41, loss = 0.27568192\n", "Iteration 42, loss = 0.26725151\n", "Iteration 43, loss = 0.27804991\n", "Iteration 44, loss = 0.26584110\n", "Iteration 45, loss = 0.27424995\n", "Iteration 46, loss = 0.25890470\n", "Iteration 47, loss = 0.26594668\n", "Iteration 48, loss = 0.27099261\n", "Iteration 49, loss = 0.26108780\n", "Iteration 50, loss = 0.27804844\n", "Iteration 1, loss = 1.30948067\n", "Iteration 2, loss = 0.67971655\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 3, loss = 0.61032328\n", "Iteration 4, loss = 0.54860803\n", "Iteration 5, loss = 0.52952644\n", "Iteration 6, loss = 0.51848945\n", "Iteration 7, loss = 0.48460041\n", "Iteration 8, loss = 0.45719661\n", "Iteration 9, loss = 0.44266024\n", "Iteration 10, loss = 0.45234613\n", "Iteration 11, loss = 0.42037097\n", "Iteration 12, loss = 0.41549225\n", "Iteration 13, loss = 0.41661733\n", "Iteration 14, loss = 0.40809612\n", "Iteration 15, loss = 0.40565847\n", "Iteration 16, loss = 0.39056021\n", "Iteration 17, loss = 0.38709168\n", "Iteration 18, loss = 0.37590114\n", "Iteration 19, loss = 0.37736672\n", "Iteration 20, loss = 0.37750103\n", "Iteration 
21, loss = 0.38363303\n", "Iteration 22, loss = 0.36499897\n", "Iteration 23, loss = 0.37173106\n", "Iteration 24, loss = 0.35957142\n", "Iteration 25, loss = 0.35759035\n", "Iteration 26, loss = 0.35429087\n", "Iteration 27, loss = 0.34860665\n", "Iteration 28, loss = 0.35514605\n", "Iteration 29, loss = 0.34553763\n", "Iteration 30, loss = 0.34903192\n", "Iteration 31, loss = 0.33948746\n", "Iteration 32, loss = 0.33143027\n", "Iteration 33, loss = 0.33483746\n", "Iteration 34, loss = 0.33940775\n", "Iteration 35, loss = 0.31548699\n", "Iteration 36, loss = 0.32329209\n", "Iteration 37, loss = 0.31819712\n", "Iteration 38, loss = 0.32314009\n", "Iteration 39, loss = 0.32364978\n", "Iteration 40, loss = 0.30842670\n", "Iteration 41, loss = 0.31406647\n", "Iteration 42, loss = 0.30753535\n", "Iteration 43, loss = 0.31933669\n", "Iteration 44, loss = 0.30834456\n", "Iteration 45, loss = 0.31216225\n", "Iteration 46, loss = 0.30992621\n", "Iteration 47, loss = 0.30738717\n", "Iteration 48, loss = 0.31731703\n", "Iteration 49, loss = 0.30683002\n", "Iteration 50, loss = 0.30742934\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 1, loss = 1.02204925\n", "Iteration 2, loss = 0.52846944\n", "Iteration 3, loss = 0.47154031\n", "Iteration 4, loss = 0.42223781\n", "Iteration 5, loss = 0.40212757\n", "Iteration 6, loss = 0.38273035\n", "Iteration 7, loss = 0.35142492\n", "Iteration 8, loss = 0.34452651\n", "Iteration 9, loss = 0.33769786\n", "Iteration 10, loss = 0.31860835\n", "Iteration 11, loss = 0.30582747\n", "Iteration 12, loss = 0.29510442\n", "Iteration 13, loss = 0.29111403\n", "Iteration 14, loss = 0.28135521\n", 
"Iteration 15, loss = 0.27076683\n", "Iteration 16, loss = 0.25606552\n", "Iteration 17, loss = 0.25187788\n", "Iteration 18, loss = 0.24518380\n", "Iteration 19, loss = 0.23577627\n", "Iteration 20, loss = 0.24206249\n", "Iteration 21, loss = 0.23233894\n", "Iteration 22, loss = 0.21781768\n", "Iteration 23, loss = 0.21698257\n", "Iteration 24, loss = 0.21587081\n", "Iteration 25, loss = 0.23187125\n", "Iteration 26, loss = 0.19548814\n", "Iteration 27, loss = 0.19236970\n", "Iteration 28, loss = 0.18282426\n", "Iteration 29, loss = 0.18801362\n", "Iteration 30, loss = 0.18943795\n", "Iteration 31, loss = 0.17717575\n", "Iteration 32, loss = 0.18247080\n", "Iteration 33, loss = 0.17143786\n", "Iteration 34, loss = 0.17263383\n", "Iteration 35, loss = 0.15719093\n", "Iteration 36, loss = 0.15491916\n", "Iteration 37, loss = 0.14938968\n", "Iteration 38, loss = 0.14709868\n", "Iteration 39, loss = 0.14558968\n", "Iteration 40, loss = 0.14272278\n", "Iteration 41, loss = 0.14254849\n", "Iteration 42, loss = 0.13923524\n", "Iteration 43, loss = 0.12667952\n", "Iteration 44, loss = 0.12399331\n", "Iteration 45, loss = 0.12308288\n", "Iteration 46, loss = 0.12297097\n", "Iteration 47, loss = 0.12364132\n", "Iteration 48, loss = 0.12298727\n", "Iteration 49, loss = 0.11518959\n", "Iteration 50, loss = 0.11036485\n", "Iteration 1, loss = 0.97585508\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 2, loss = 0.53182140\n", "Iteration 3, loss = 0.46414198\n", "Iteration 4, loss = 0.42823289\n", "Iteration 5, loss = 0.40332828\n", "Iteration 6, loss = 0.38605251\n", "Iteration 7, loss = 0.35325965\n", "Iteration 8, loss = 
0.34517668\n", "Iteration 9, loss = 0.32801200\n", "Iteration 10, loss = 0.31721743\n", "Iteration 11, loss = 0.30514699\n", "Iteration 12, loss = 0.29973250\n", "Iteration 13, loss = 0.28771119\n", "Iteration 14, loss = 0.29683461\n", "Iteration 15, loss = 0.27255984\n", "Iteration 16, loss = 0.26717333\n", "Iteration 17, loss = 0.25453813\n", "Iteration 18, loss = 0.25186008\n", "Iteration 19, loss = 0.24410071\n", "Iteration 20, loss = 0.23633097\n", "Iteration 21, loss = 0.22871941\n", "Iteration 22, loss = 0.22368264\n", "Iteration 23, loss = 0.23170120\n", "Iteration 24, loss = 0.21802035\n", "Iteration 25, loss = 0.21295836\n", "Iteration 26, loss = 0.19552538\n", "Iteration 27, loss = 0.19079388\n", "Iteration 28, loss = 0.19060359\n", "Iteration 29, loss = 0.20147476\n", "Iteration 30, loss = 0.18654306\n", "Iteration 31, loss = 0.17193075\n", "Iteration 32, loss = 0.17967590\n", "Iteration 33, loss = 0.17168587\n", "Iteration 34, loss = 0.17031909\n", "Iteration 35, loss = 0.16577684\n", "Iteration 36, loss = 0.15473001\n", "Iteration 37, loss = 0.15556070\n", "Iteration 38, loss = 0.15337017\n", "Iteration 39, loss = 0.16800401\n", "Iteration 40, loss = 0.14416282\n", "Iteration 41, loss = 0.13979444\n", "Iteration 42, loss = 0.13899102\n", "Iteration 43, loss = 0.13323407\n", "Iteration 44, loss = 0.12509203\n", "Iteration 45, loss = 0.11819994\n", "Iteration 46, loss = 0.13441720\n", "Iteration 47, loss = 0.13105934\n", "Iteration 48, loss = 0.12057777\n", "Iteration 49, loss = 0.12699677\n", "Iteration 50, loss = 0.12175675\n", "Iteration 1, loss = 0.96094659\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 2, 
loss = 0.51974390\n", "Iteration 3, loss = 0.45325911\n", "Iteration 4, loss = 0.41473674\n", "Iteration 5, loss = 0.39674561\n", "Iteration 6, loss = 0.36618403\n", "Iteration 7, loss = 0.34419002\n", "Iteration 8, loss = 0.33715489\n", "Iteration 9, loss = 0.32245070\n", "Iteration 10, loss = 0.30553541\n", "Iteration 11, loss = 0.29209471\n", "Iteration 12, loss = 0.29317372\n", "Iteration 13, loss = 0.26462440\n", "Iteration 14, loss = 0.27508225\n", "Iteration 15, loss = 0.26029545\n", "Iteration 16, loss = 0.25612316\n", "Iteration 17, loss = 0.24115598\n", "Iteration 18, loss = 0.24173323\n", "Iteration 19, loss = 0.23183028\n", "Iteration 20, loss = 0.22474933\n", "Iteration 21, loss = 0.21061501\n", "Iteration 22, loss = 0.22825001\n", "Iteration 23, loss = 0.21832399\n", "Iteration 24, loss = 0.20124838\n", "Iteration 25, loss = 0.19930125\n", "Iteration 26, loss = 0.19193117\n", "Iteration 27, loss = 0.17539607\n", "Iteration 28, loss = 0.18501286\n", "Iteration 29, loss = 0.17898176\n", "Iteration 30, loss = 0.17332242\n", "Iteration 31, loss = 0.16104996\n", "Iteration 32, loss = 0.16261552\n", "Iteration 33, loss = 0.17476651\n", "Iteration 34, loss = 0.14933432\n", "Iteration 35, loss = 0.15353900\n", "Iteration 36, loss = 0.15331781\n", "Iteration 37, loss = 0.15939206\n", "Iteration 38, loss = 0.15093825\n", "Iteration 39, loss = 0.14245524\n", "Iteration 40, loss = 0.13844049\n", "Iteration 41, loss = 0.13472301\n", "Iteration 42, loss = 0.12298521\n", "Iteration 43, loss = 0.12250876\n", "Iteration 44, loss = 0.12139805\n", "Iteration 45, loss = 0.10833346\n", "Iteration 46, loss = 0.11405142\n", "Iteration 47, loss = 0.12239429\n", "Iteration 48, loss = 0.11171105\n", "Iteration 49, loss = 0.10840508\n", "Iteration 50, loss = 0.10168689\n", "Iteration 1, loss = 0.95892386\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ 
"A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 2, loss = 0.52175089\n", "Iteration 3, loss = 0.44924300\n", "Iteration 4, loss = 0.41527248\n", "Iteration 5, loss = 0.38806532\n", "Iteration 6, loss = 0.36411046\n", "Iteration 7, loss = 0.34045827\n", "Iteration 8, loss = 0.33171795\n", "Iteration 9, loss = 0.31323272\n", "Iteration 10, loss = 0.29896372\n", "Iteration 11, loss = 0.29178119\n", "Iteration 12, loss = 0.27879445\n", "Iteration 13, loss = 0.26551544\n", "Iteration 14, loss = 0.25804216\n", "Iteration 15, loss = 0.25668616\n", "Iteration 16, loss = 0.24412757\n", "Iteration 17, loss = 0.23954891\n", "Iteration 18, loss = 0.23136672\n", "Iteration 19, loss = 0.22183790\n", "Iteration 20, loss = 0.21302625\n", "Iteration 21, loss = 0.20949064\n", "Iteration 22, loss = 0.19858183\n", "Iteration 23, loss = 0.20047975\n", "Iteration 24, loss = 0.19244317\n", "Iteration 25, loss = 0.17979579\n", "Iteration 26, loss = 0.19814194\n", "Iteration 27, loss = 0.17806842\n", "Iteration 28, loss = 0.17185939\n", "Iteration 29, loss = 0.16195061\n", "Iteration 30, loss = 0.16017796\n", "Iteration 31, loss = 0.14474421\n", "Iteration 32, loss = 0.15335383\n", "Iteration 33, loss = 0.15015871\n", "Iteration 34, loss = 0.13779954\n", "Iteration 35, loss = 0.14258985\n", "Iteration 36, loss = 0.13652991\n", "Iteration 37, loss = 0.14201725\n", "Iteration 38, loss = 0.12368117\n", "Iteration 39, loss = 0.12939435\n", "Iteration 40, loss = 0.12166353\n", "Iteration 41, loss = 0.11794014\n", "Iteration 42, loss = 0.11609789\n", "Iteration 43, loss = 0.10669622\n", "Iteration 44, loss = 0.10682459\n", "Iteration 45, loss = 0.09665414\n", "Iteration 46, loss = 0.09766512\n", "Iteration 47, loss 
= 0.10298660\n", "Iteration 48, loss = 0.10355953\n", "Iteration 49, loss = 0.09616742\n", "Iteration 50, loss = 0.09007099\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 1, loss = 0.96600744\n", "Iteration 2, loss = 0.53171388\n", "Iteration 3, loss = 0.46750628\n", "Iteration 4, loss = 0.42301923\n", "Iteration 5, loss = 0.40428242\n", "Iteration 6, loss = 0.37839088\n", "Iteration 7, loss = 0.35415706\n", "Iteration 8, loss = 0.33286763\n", "Iteration 9, loss = 0.32382092\n", "Iteration 10, loss = 0.31302465\n", "Iteration 11, loss = 0.30524384\n", "Iteration 12, loss = 0.29467402\n", "Iteration 13, loss = 0.27544181\n", "Iteration 14, loss = 0.27205985\n", "Iteration 15, loss = 0.25919888\n", "Iteration 16, loss = 0.25725857\n", "Iteration 17, loss = 0.24281670\n", "Iteration 18, loss = 0.23384900\n", "Iteration 19, loss = 0.22801817\n", "Iteration 20, loss = 0.21696062\n", "Iteration 21, loss = 0.21218997\n", "Iteration 22, loss = 0.20751432\n", "Iteration 23, loss = 0.20113966\n", "Iteration 24, loss = 0.20591105\n", "Iteration 25, loss = 0.18255101\n", "Iteration 26, loss = 0.19970217\n", "Iteration 27, loss = 0.17548770\n", "Iteration 28, loss = 0.17053012\n", "Iteration 29, loss = 0.17441495\n", "Iteration 30, loss = 0.16341984\n", "Iteration 31, loss = 0.16657656\n", "Iteration 32, loss = 0.15616780\n", "Iteration 33, loss = 0.16031924\n", "Iteration 34, loss = 0.15965583\n", "Iteration 35, loss = 0.16212723\n", "Iteration 36, loss = 0.14236236\n", "Iteration 37, loss = 0.13783089\n", "Iteration 38, loss = 0.14379320\n", "Iteration 39, loss = 0.13061418\n", "Iteration 40, loss = 0.12052991\n", "Iteration 41, 
loss = 0.12540855\n", "Iteration 42, loss = 0.12770946\n", "Iteration 43, loss = 0.11244005\n", "Iteration 44, loss = 0.11126138\n", "Iteration 45, loss = 0.10375458\n", "Iteration 46, loss = 0.11694992\n", "Iteration 47, loss = 0.09817465\n", "Iteration 48, loss = 0.11264426\n", "Iteration 49, loss = 0.10122205\n", "Iteration 50, loss = 0.09426802\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 1, loss = 1.50596079\n", "Iteration 2, loss = 0.78306291\n", "Iteration 3, loss = 0.66862017\n", "Iteration 4, loss = 0.61425531\n", "Iteration 5, loss = 0.58161571\n", "Iteration 6, loss = 1.10381399\n", "Iteration 7, loss = 0.74807862\n", "Iteration 8, loss = 0.68002147\n", "Iteration 9, loss = 0.63949818\n", "Iteration 10, loss = 0.61660400\n", "Iteration 11, loss = 0.63157803\n", "Iteration 12, loss = 0.59300351\n", "Iteration 13, loss = 0.66610799\n", "Iteration 14, loss = 0.54659553\n", "Iteration 15, loss = 0.55575851\n", "Iteration 16, loss = 0.66114856\n", "Iteration 17, loss = 0.52863999\n", "Iteration 18, loss = 0.51462429\n", "Iteration 19, loss = 0.52268634\n", "Iteration 20, loss = 0.49797843\n", "Iteration 21, loss = 0.48704602\n", "Iteration 22, loss = 0.50661741\n", "Iteration 23, loss = 0.50777080\n", "Iteration 24, loss = 0.48971024\n", "Iteration 25, loss = 0.48611527\n", "Iteration 26, loss = 0.45183718\n", "Iteration 27, loss = 0.51057214\n", "Iteration 28, loss = 0.46187869\n", "Iteration 29, loss = 0.47391236\n", "Iteration 30, loss = 0.46809656\n", "Iteration 31, loss = 0.46336839\n", "Iteration 32, loss = 0.57206749\n", "Iteration 33, loss = 0.72110750\n", "Iteration 34, loss = 0.52240417\n", "Iteration 
35, loss = 0.49203012\n", "Iteration 36, loss = 0.48474009\n", "Iteration 37, loss = 0.46548140\n", "Training loss did not improve more than tol=0.000100 for 10 consecutive epochs. Stopping.\n", "Iteration 1, loss = 1.48750784\n", "Iteration 2, loss = 0.73618081\n", "Iteration 3, loss = 0.61483931\n", "Iteration 4, loss = 0.55407019\n", "Iteration 5, loss = 0.52193140\n", "Iteration 6, loss = 0.49962742\n", "Iteration 7, loss = 0.47062381\n", "Iteration 8, loss = 0.47258315\n", "Iteration 9, loss = 0.44271921\n", "Iteration 10, loss = 0.44069825\n", "Iteration 11, loss = 0.42940523\n", "Iteration 12, loss = 0.41482167\n", "Iteration 13, loss = 0.42048456\n", "Iteration 14, loss = 0.45806478\n", "Iteration 15, loss = 0.38697696\n", "Iteration 16, loss = 0.37556057\n", "Iteration 17, loss = 0.38009025\n", "Iteration 18, loss = 0.37034703\n", "Iteration 19, loss = 0.36497068\n", "Iteration 20, loss = 0.36520950\n", "Iteration 21, loss = 0.35862645\n", "Iteration 22, loss = 0.36361351\n", "Iteration 23, loss = 0.36165391\n", "Iteration 24, loss = 0.35141268\n", "Iteration 25, loss = 0.34926668\n", "Iteration 26, loss = 0.34683364\n", "Iteration 27, loss = 0.34265338\n", "Iteration 28, loss = 0.34418021\n", "Iteration 29, loss = 0.34769456\n", "Iteration 30, loss = 0.34170370\n", "Iteration 31, loss = 0.33500391\n", "Iteration 32, loss = 0.32728501\n", "Iteration 33, loss = 0.32516021\n", "Iteration 34, loss = 0.33129035\n", "Iteration 35, loss = 0.32651349\n", "Iteration 36, loss = 0.31285517\n", "Iteration 37, loss = 0.32382849\n", "Iteration 38, loss = 0.30559876\n", "Iteration 39, loss = 0.30754627\n", "Iteration 40, loss = 0.32464322\n", "Iteration 41, loss = 0.31127617\n", "Iteration 42, loss = 0.31249142\n", "Iteration 43, loss = 0.30733111\n", "Iteration 44, loss = 0.30794308\n", "Iteration 45, loss = 0.31640767\n", "Iteration 46, loss = 0.30365742\n", "Iteration 47, loss = 0.29571778\n", "Iteration 48, loss = 0.31040693\n", "Iteration 49, loss = 0.29935034\n", 
"Iteration 50, loss = 0.29717260\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 1, loss = 1.61212130\n", "Iteration 2, loss = 0.84165622\n", "Iteration 3, loss = 0.73136499\n", "Iteration 4, loss = 0.61192339\n", "Iteration 5, loss = 0.58014223\n", "Iteration 6, loss = 0.52982466\n", "Iteration 7, loss = 0.51492270\n", "Iteration 8, loss = 0.48876669\n", "Iteration 9, loss = 0.47158576\n", "Iteration 10, loss = 0.45716926\n", "Iteration 11, loss = 0.45061351\n", "Iteration 12, loss = 0.45783216\n", "Iteration 13, loss = 0.45511450\n", "Iteration 14, loss = 0.43157310\n", "Iteration 15, loss = 0.41386226\n", "Iteration 16, loss = 0.41814264\n", "Iteration 17, loss = 0.56434483\n", "Iteration 18, loss = 0.56990229\n", "Iteration 19, loss = 0.52493504\n", "Iteration 20, loss = 0.48556350\n", "Iteration 21, loss = 0.42142954\n", "Iteration 22, loss = 0.41291576\n", "Iteration 23, loss = 0.39988230\n", "Iteration 24, loss = 0.40518525\n", "Iteration 25, loss = 0.37978262\n", "Iteration 26, loss = 0.39033700\n", "Iteration 27, loss = 0.37290709\n", "Iteration 28, loss = 0.36744534\n", "Iteration 29, loss = 0.37513233\n", "Iteration 30, loss = 0.38773153\n", "Iteration 31, loss = 0.35670741\n", "Iteration 32, loss = 0.37197005\n", "Iteration 33, loss = 0.36705113\n", "Iteration 34, loss = 0.37091030\n", "Iteration 35, loss = 0.36117313\n", "Iteration 36, loss = 0.34972240\n", "Iteration 37, loss = 0.35569489\n", "Iteration 38, loss = 0.35411527\n", "Iteration 39, loss = 0.36224098\n", "Iteration 40, loss = 0.36580690\n", "Iteration 41, loss = 0.34734537\n", "Iteration 42, loss = 0.34379709\n", "Iteration 43, loss = 
0.39697675\n", "Iteration 44, loss = 0.34850803\n", "Iteration 45, loss = 0.34508361\n", "Iteration 46, loss = 0.34315703\n", "Iteration 47, loss = 0.33341565\n", "Iteration 48, loss = 0.33099732\n", "Iteration 49, loss = 0.33624408\n", "Iteration 50, loss = 0.34430641\n", "Iteration 1, loss = 1.51881196\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 2, loss = 0.72590740\n", "Iteration 3, loss = 0.59085393\n", "Iteration 4, loss = 0.62251764\n", "Iteration 5, loss = 0.50552388\n", "Iteration 6, loss = 0.47575259\n", "Iteration 7, loss = 0.44577893\n", "Iteration 8, loss = 0.43104103\n", "Iteration 9, loss = 0.45651430\n", "Iteration 10, loss = 0.41914389\n", "Iteration 11, loss = 0.42053119\n", "Iteration 12, loss = 0.39916406\n", "Iteration 13, loss = 0.39131062\n", "Iteration 14, loss = 0.37703261\n", "Iteration 15, loss = 0.36904458\n", "Iteration 16, loss = 0.36867115\n", "Iteration 17, loss = 0.37710297\n", "Iteration 18, loss = 0.36749427\n", "Iteration 19, loss = 0.34980918\n", "Iteration 20, loss = 0.34558213\n", "Iteration 21, loss = 0.35132048\n", "Iteration 22, loss = 0.34189141\n", "Iteration 23, loss = 0.34546951\n", "Iteration 24, loss = 0.34815492\n", "Iteration 25, loss = 0.33855280\n", "Iteration 26, loss = 0.32424476\n", "Iteration 27, loss = 0.33000597\n", "Iteration 28, loss = 0.31560710\n", "Iteration 29, loss = 0.33255892\n", "Iteration 30, loss = 0.30405490\n", "Iteration 31, loss = 0.31691217\n", "Iteration 32, loss = 0.45449737\n", "Iteration 33, loss = 0.34884748\n", "Iteration 34, loss = 0.29872583\n", "Iteration 35, loss = 0.29970063\n", "Iteration 36, loss = 0.29566792\n", "Iteration 37, loss 
= 0.29395850\n", "Iteration 38, loss = 0.29058405\n", "Iteration 39, loss = 0.29161186\n", "Iteration 40, loss = 0.28183430\n", "Iteration 41, loss = 0.28064025\n", "Iteration 42, loss = 0.28073392\n", "Iteration 43, loss = 0.30484289\n", "Iteration 44, loss = 0.29408262\n", "Iteration 45, loss = 0.28961766\n", "Iteration 46, loss = 0.27292929\n", "Iteration 47, loss = 0.27816627\n", "Iteration 48, loss = 0.26553391\n", "Iteration 49, loss = 0.28006050\n", "Iteration 50, loss = 0.26295555\n", "Iteration 1, loss = 1.36732745\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 2, loss = 0.68920707\n", "Iteration 3, loss = 0.59043888\n", "Iteration 4, loss = 0.53608848\n", "Iteration 5, loss = 0.48489437\n", "Iteration 6, loss = 0.47016579\n", "Iteration 7, loss = 0.46567266\n", "Iteration 8, loss = 0.43416912\n", "Iteration 9, loss = 0.43761624\n", "Iteration 10, loss = 0.41512321\n", "Iteration 11, loss = 0.39641129\n", "Iteration 12, loss = 0.40919292\n", "Iteration 13, loss = 0.38361126\n", "Iteration 14, loss = 0.39041898\n", "Iteration 15, loss = 0.37258137\n", "Iteration 16, loss = 0.39026287\n", "Iteration 17, loss = 0.36274936\n", "Iteration 18, loss = 0.35579164\n", "Iteration 19, loss = 0.36106502\n", "Iteration 20, loss = 0.34763550\n", "Iteration 21, loss = 0.33679279\n", "Iteration 22, loss = 0.33498936\n", "Iteration 23, loss = 0.32817132\n", "Iteration 24, loss = 0.32928483\n", "Iteration 25, loss = 0.32155907\n", "Iteration 26, loss = 0.32369708\n", "Iteration 27, loss = 0.33086357\n", "Iteration 28, loss = 0.31961061\n", "Iteration 29, loss = 0.30641233\n", "Iteration 30, loss = 0.29554403\n", "Iteration 31, 
loss = 0.31277990\n", "Iteration 32, loss = 0.32660223\n", "Iteration 33, loss = 0.30875269\n", "Iteration 34, loss = 0.29677099\n", "Iteration 35, loss = 0.30462555\n", "Iteration 36, loss = 0.30065432\n", "Iteration 37, loss = 0.28055431\n", "Iteration 38, loss = 0.28216700\n", "Iteration 39, loss = 0.28394085\n", "Iteration 40, loss = 0.30976863\n", "Iteration 41, loss = 0.31419204\n", "Iteration 42, loss = 0.29560222\n", "Iteration 43, loss = 0.27613403\n", "Iteration 44, loss = 0.26569823\n", "Iteration 45, loss = 0.26684121\n", "Iteration 46, loss = 0.26594767\n", "Iteration 47, loss = 0.26708168\n", "Iteration 48, loss = 0.29411483\n", "Iteration 49, loss = 0.27883418\n", "Iteration 50, loss = 0.26005961\n", "Iteration 1, loss = 1.15953982\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 2, loss = 0.56632102\n", "Iteration 3, loss = 0.48196393\n", "Iteration 4, loss = 0.41749618\n", "Iteration 5, loss = 0.38656963\n", "Iteration 6, loss = 0.37464719\n", "Iteration 7, loss = 0.34993015\n", "Iteration 8, loss = 0.32449722\n", "Iteration 9, loss = 0.32141753\n", "Iteration 10, loss = 0.30603026\n", "Iteration 11, loss = 0.29595672\n", "Iteration 12, loss = 0.26898977\n", "Iteration 13, loss = 0.26634978\n", "Iteration 14, loss = 0.25530582\n", "Iteration 15, loss = 0.25209399\n", "Iteration 16, loss = 0.24151609\n", "Iteration 17, loss = 0.25650905\n", "Iteration 18, loss = 0.23438690\n", "Iteration 19, loss = 0.21976503\n", "Iteration 20, loss = 0.21112415\n", "Iteration 21, loss = 0.20575605\n", "Iteration 22, loss = 0.21161933\n", "Iteration 23, loss = 0.18771231\n", "Iteration 24, loss = 0.18814050\n", "Iteration 
25, loss = 0.17546694\n", "Iteration 26, loss = 0.19530557\n", "Iteration 27, loss = 0.17760052\n", "Iteration 28, loss = 0.16650848\n", "Iteration 29, loss = 0.17099761\n", "Iteration 30, loss = 0.17989832\n", "Iteration 31, loss = 0.17657126\n", "Iteration 32, loss = 0.17333515\n", "Iteration 33, loss = 0.14826805\n", "Iteration 34, loss = 0.14217004\n", "Iteration 35, loss = 0.14885145\n", "Iteration 36, loss = 0.13295324\n", "Iteration 37, loss = 0.12930306\n", "Iteration 38, loss = 0.12355839\n", "Iteration 39, loss = 0.11600059\n", "Iteration 40, loss = 0.11961616\n", "Iteration 41, loss = 0.12397739\n", "Iteration 42, loss = 0.13486168\n", "Iteration 43, loss = 0.14797559\n", "Iteration 44, loss = 0.10134303\n", "Iteration 45, loss = 0.11306352\n", "Iteration 46, loss = 0.10687999\n", "Iteration 47, loss = 0.09611245\n", "Iteration 48, loss = 0.13456278\n", "Iteration 49, loss = 0.21421670\n", "Iteration 50, loss = 0.14621816\n", "Iteration 1, loss = 1.22681759\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 2, loss = 0.57400179\n", "Iteration 3, loss = 0.48418348\n", "Iteration 4, loss = 0.42596435\n", "Iteration 5, loss = 0.40048281\n", "Iteration 6, loss = 0.39244516\n", "Iteration 7, loss = 0.35358400\n", "Iteration 8, loss = 0.33211892\n", "Iteration 9, loss = 0.32478152\n", "Iteration 10, loss = 0.31416860\n", "Iteration 11, loss = 0.29627333\n", "Iteration 12, loss = 0.28722963\n", "Iteration 13, loss = 0.27924726\n", "Iteration 14, loss = 0.26671204\n", "Iteration 15, loss = 0.25894926\n", "Iteration 16, loss = 0.25058976\n", "Iteration 17, loss = 0.24308499\n", "Iteration 18, loss = 0.26029938\n", 
"Iteration 19, loss = 0.24197642\n", "Iteration 20, loss = 0.21538307\n", "Iteration 21, loss = 0.21821822\n", "Iteration 22, loss = 0.21002745\n", "Iteration 23, loss = 0.20351883\n", "Iteration 24, loss = 0.19432393\n", "Iteration 25, loss = 0.20700930\n", "Iteration 26, loss = 0.18643055\n", "Iteration 27, loss = 0.19737954\n", "Iteration 28, loss = 0.18157952\n", "Iteration 29, loss = 0.18573056\n", "Iteration 30, loss = 0.17085285\n", "Iteration 31, loss = 0.16228089\n", "Iteration 32, loss = 0.16192393\n", "Iteration 33, loss = 0.15994653\n", "Iteration 34, loss = 0.16228769\n", "Iteration 35, loss = 0.13679650\n", "Iteration 36, loss = 0.14010523\n", "Iteration 37, loss = 0.13464130\n", "Iteration 38, loss = 0.13712784\n", "Iteration 39, loss = 0.15749104\n", "Iteration 40, loss = 0.14673589\n", "Iteration 41, loss = 0.13541778\n", "Iteration 42, loss = 0.11960193\n", "Iteration 43, loss = 0.11715896\n", "Iteration 44, loss = 0.11622320\n", "Iteration 45, loss = 0.11892814\n", "Iteration 46, loss = 0.12764482\n", "Iteration 47, loss = 0.10196238\n", "Iteration 48, loss = 0.10466290\n", "Iteration 49, loss = 0.16800817\n", "Iteration 50, loss = 0.10778782\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 1, loss = 1.27442977\n", "Iteration 2, loss = 0.59334351\n", "Iteration 3, loss = 0.48517185\n", "Iteration 4, loss = 0.42785186\n", "Iteration 5, loss = 0.40944527\n", "Iteration 6, loss = 0.37956785\n", "Iteration 7, loss = 0.35450925\n", "Iteration 8, loss = 0.35173307\n", "Iteration 9, loss = 0.32688195\n", "Iteration 10, loss = 0.31983840\n", "Iteration 11, loss = 0.31884149\n", "Iteration 12, loss = 
0.29888497\n", "Iteration 13, loss = 0.29311039\n", "Iteration 14, loss = 0.27587639\n", "Iteration 15, loss = 0.26616819\n", "Iteration 16, loss = 0.25673775\n", "Iteration 17, loss = 0.25699396\n", "Iteration 18, loss = 0.25647055\n", "Iteration 19, loss = 0.24336693\n", "Iteration 20, loss = 0.23636497\n", "Iteration 21, loss = 0.22575869\n", "Iteration 22, loss = 0.22341463\n", "Iteration 23, loss = 0.21042687\n", "Iteration 24, loss = 0.23134642\n", "Iteration 25, loss = 0.20460474\n", "Iteration 26, loss = 0.20431327\n", "Iteration 27, loss = 0.19825042\n", "Iteration 28, loss = 0.18874913\n", "Iteration 29, loss = 0.17352506\n", "Iteration 30, loss = 0.19318278\n", "Iteration 31, loss = 0.19682077\n", "Iteration 32, loss = 0.17450868\n", "Iteration 33, loss = 0.16746956\n", "Iteration 34, loss = 0.15930031\n", "Iteration 35, loss = 0.16410274\n", "Iteration 36, loss = 0.15284037\n", "Iteration 37, loss = 0.14985137\n", "Iteration 38, loss = 0.14194933\n", "Iteration 39, loss = 0.19264507\n", "Iteration 40, loss = 0.16190958\n", "Iteration 41, loss = 0.14229525\n", "Iteration 42, loss = 0.14224007\n", "Iteration 43, loss = 0.14130542\n", "Iteration 44, loss = 0.14719724\n", "Iteration 45, loss = 0.14294073\n", "Iteration 46, loss = 0.12056841\n", "Iteration 47, loss = 0.13546350\n", "Iteration 48, loss = 0.12867914\n", "Iteration 49, loss = 0.11479498\n", "Iteration 50, loss = 0.16617138\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 1, loss = 1.11538056\n", "Iteration 2, loss = 0.55035770\n", "Iteration 3, loss = 0.48005690\n", "Iteration 4, loss = 0.41490155\n", "Iteration 5, loss = 0.38592948\n", "Iteration 6, 
loss = 0.36138885\n", "Iteration 7, loss = 0.34769045\n", "Iteration 8, loss = 0.32641757\n", "Iteration 9, loss = 0.31005210\n", "Iteration 10, loss = 0.29148371\n", "Iteration 11, loss = 0.29213832\n", "Iteration 12, loss = 0.27479646\n", "Iteration 13, loss = 0.26050868\n", "Iteration 14, loss = 0.25638823\n", "Iteration 15, loss = 0.25821308\n", "Iteration 16, loss = 0.22843871\n", "Iteration 17, loss = 0.26086209\n", "Iteration 18, loss = 0.22830327\n", "Iteration 19, loss = 0.23725947\n", "Iteration 20, loss = 0.21445582\n", "Iteration 21, loss = 0.20485804\n", "Iteration 22, loss = 0.19438702\n", "Iteration 23, loss = 0.19551805\n", "Iteration 24, loss = 0.19816549\n", "Iteration 25, loss = 0.17714320\n", "Iteration 26, loss = 0.17580968\n", "Iteration 27, loss = 0.16259834\n", "Iteration 28, loss = 0.16105974\n", "Iteration 29, loss = 0.16104625\n", "Iteration 30, loss = 0.15613805\n", "Iteration 31, loss = 0.15466963\n", "Iteration 32, loss = 0.15942157\n", "Iteration 33, loss = 0.14615678\n", "Iteration 34, loss = 0.16803248\n", "Iteration 35, loss = 0.13871575\n", "Iteration 36, loss = 0.13766299\n", "Iteration 37, loss = 0.12135482\n", "Iteration 38, loss = 0.13310634\n", "Iteration 39, loss = 0.14294225\n", "Iteration 40, loss = 0.15625352\n", "Iteration 41, loss = 0.14622816\n", "Iteration 42, loss = 0.11501413\n", "Iteration 43, loss = 0.10127007\n", "Iteration 44, loss = 0.10505904\n", "Iteration 45, loss = 0.10453928\n", "Iteration 46, loss = 0.11700670\n", "Iteration 47, loss = 0.09566447\n", "Iteration 48, loss = 0.12035684\n", "Iteration 49, loss = 0.08717201\n", "Iteration 50, loss = 0.08452620\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", 
"output_type": "stream", "text": [ "Iteration 1, loss = 1.03082416\n", "Iteration 2, loss = 0.55992167\n", "Iteration 3, loss = 0.48646284\n", "Iteration 4, loss = 0.42393545\n", "Iteration 5, loss = 0.39132982\n", "Iteration 6, loss = 0.36758175\n", "Iteration 7, loss = 0.34755549\n", "Iteration 8, loss = 0.33545037\n", "Iteration 9, loss = 0.32215803\n", "Iteration 10, loss = 0.29800005\n", "Iteration 11, loss = 0.29564909\n", "Iteration 12, loss = 0.28186965\n", "Iteration 13, loss = 0.25744827\n", "Iteration 14, loss = 0.26045685\n", "Iteration 15, loss = 0.25319811\n", "Iteration 16, loss = 0.23052456\n", "Iteration 17, loss = 0.23379113\n", "Iteration 18, loss = 0.21781193\n", "Iteration 19, loss = 0.20544571\n", "Iteration 20, loss = 0.21762216\n", "Iteration 21, loss = 0.20694070\n", "Iteration 22, loss = 0.19541282\n", "Iteration 23, loss = 0.20108694\n", "Iteration 24, loss = 0.18310905\n", "Iteration 25, loss = 0.18218900\n", "Iteration 26, loss = 0.17781469\n", "Iteration 27, loss = 0.16438105\n", "Iteration 28, loss = 0.19216311\n", "Iteration 29, loss = 0.16776784\n", "Iteration 30, loss = 0.18089437\n", "Iteration 31, loss = 0.15785309\n", "Iteration 32, loss = 0.14732536\n", "Iteration 33, loss = 0.14220030\n", "Iteration 34, loss = 0.13937829\n", "Iteration 35, loss = 0.12862834\n", "Iteration 36, loss = 0.14192243\n", "Iteration 37, loss = 0.12642765\n", "Iteration 38, loss = 0.11410477\n", "Iteration 39, loss = 0.11255568\n", "Iteration 40, loss = 0.11933455\n", "Iteration 41, loss = 0.11107101\n", "Iteration 42, loss = 0.10971539\n", "Iteration 43, loss = 0.09388885\n", "Iteration 44, loss = 0.12094420\n", "Iteration 45, loss = 0.11440281\n", "Iteration 46, loss = 0.10210654\n", "Iteration 47, loss = 0.09528841\n", "Iteration 48, loss = 0.09160486\n", "Iteration 49, loss = 0.07993320\n", "Iteration 50, loss = 0.07200011\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ 
"A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Iteration 1, loss = 0.88319614\n", "Iteration 2, loss = 0.50793429\n", "Iteration 3, loss = 0.46196928\n", "Iteration 4, loss = 0.42098273\n", "Iteration 5, loss = 0.38674721\n", "Iteration 6, loss = 0.36751516\n", "Iteration 7, loss = 0.34335619\n", "Iteration 8, loss = 0.32552195\n", "Iteration 9, loss = 0.32341516\n", "Iteration 10, loss = 0.30618748\n", "Iteration 11, loss = 0.29888448\n", "Iteration 12, loss = 0.28812727\n", "Iteration 13, loss = 0.28183777\n", "Iteration 14, loss = 0.27011055\n", "Iteration 15, loss = 0.26127752\n", "Iteration 16, loss = 0.25157114\n", "Iteration 17, loss = 0.25552899\n", "Iteration 18, loss = 0.23619496\n", "Iteration 19, loss = 0.23859845\n", "Iteration 20, loss = 0.23047925\n", "Iteration 21, loss = 0.21871712\n", "Iteration 22, loss = 0.21462198\n", "Iteration 23, loss = 0.21084839\n", "Iteration 24, loss = 0.21433333\n", "Iteration 25, loss = 0.19576466\n", "Iteration 26, loss = 0.20028233\n", "Iteration 27, loss = 0.18297649\n", "Iteration 28, loss = 0.19082379\n", "Iteration 29, loss = 0.18536914\n", "Iteration 30, loss = 0.17883075\n", "Iteration 31, loss = 0.17432974\n", "Iteration 32, loss = 0.16349988\n", "Iteration 33, loss = 0.16660378\n", "Iteration 34, loss = 0.16455106\n", "Iteration 35, loss = 0.15799562\n", "Iteration 36, loss = 0.16193027\n", "Iteration 37, loss = 0.15305652\n", "Iteration 38, loss = 0.14825680\n", "Iteration 39, loss = 0.15209548\n", "Iteration 40, loss = 0.14582625\n", "Iteration 41, loss = 0.14437471\n", "Iteration 42, loss = 0.12871896\n", "Iteration 43, loss = 0.13979650\n", "Iteration 44, loss = 0.13506464\n", "Iteration 45, loss = 0.11998005\n", "Iteration 46, loss 
= 0.14388038\n", "Iteration 47, loss = 0.13242641\n", "Iteration 48, loss = 0.11844929\n", "Iteration 49, loss = 0.10345062\n", "Iteration 50, loss = 0.10845943\n", "RESULTS FOR NN\n", "\n", "Best parameters set found:\n", "{'hidden_layer_sizes': (50,)}\n", "Score with best parameters:\n", "0.8472000000000002\n", "\n", "All scores on the grid:\n", "{'mean_fit_time': array([3.48991451, 5.10629439, 3.45933599, 5.89379749]), 'std_fit_time': array([0.14558508, 0.07054004, 0.42420765, 0.07141195]), 'mean_score_time': array([0.00848894, 0.0111835 , 0.00686765, 0.01117058]), 'std_score_time': array([0.00544379, 0.00640661, 0.0050469 , 0.0061501 ]), 'param_hidden_layer_sizes': masked_array(data=[(10,), (50,), (10, 10), (50, 50)],\n", " mask=[False, False, False, False],\n", " fill_value='?',\n", " dtype=object), 'params': [{'hidden_layer_sizes': (10,)}, {'hidden_layer_sizes': (50,)}, {'hidden_layer_sizes': (10, 10)}, {'hidden_layer_sizes': (50, 50)}], 'split0_test_score': array([0.8195, 0.8435, 0.798 , 0.8445]), 'split1_test_score': array([0.8315, 0.8445, 0.8145, 0.8375]), 'split2_test_score': array([0.8185, 0.8445, 0.8195, 0.846 ]), 'split3_test_score': array([0.788 , 0.845 , 0.7625, 0.849 ]), 'split4_test_score': array([0.822 , 0.8585, 0.8415, 0.856 ]), 'mean_test_score': array([0.8159, 0.8472, 0.8072, 0.8466]), 'std_test_score': array([0.01468809, 0.00567098, 0.02632033, 0.00602827]), 'rank_test_score': array([3, 1, 4, 2])}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "A:\\InstallationDriver\\Anaconda\\anaconda3\\lib\\site-packages\\sklearn\\neural_network\\_multilayer_perceptron.py:614: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.\n", " warnings.warn(\n" ] } ], "source": [ "#for NN we try the same architectures as before\n", "hl_parameters = {'hidden_layer_sizes': [(10,), (50,), (10,10,), (50,50,)]}\n", "\n", "mlp_large_cv = MLPClassifier(max_iter=50, alpha=1e-4, solver='sgd', 
tol=1e-4, learning_rate_init=.1, random_state=ID, verbose=True)\n", "GS_CV = GridSearchCV(mlp_large_cv, hl_parameters, cv=5, verbose=True)\n", "GS_CV.fit(X_train, y_train)\n", "\n", "\n", "print ('RESULTS FOR NN\\n')\n", "\n", "print(\"Best parameters set found:\")\n", "print(GS_CV.best_params_)\n", "\n", "print(\"Score with best parameters:\")\n", "print(GS_CV.best_score_)\n", "\n", "print(\"\\nAll scores on the grid:\")\n", "print(GS_CV.cv_results_)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 5\n", "Describe your architecture choices and the results you observe with respect to the architectures you used." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Mean 5-fold cross-validation accuracy for each architecture:\n", "\n", "- 1 hidden layer with 10 neurons: 0.8159\n", "- 1 hidden layer with 50 neurons: 0.8472\n", "- 2 hidden layers with 10 neurons each: 0.8072\n", "- 2 hidden layers with 50 neurons each: 0.8466\n", "\n", "The best architecture is 1 hidden layer with 50 neurons, almost tied with 2 hidden layers of 50 neurons each; the networks with 10 neurons per layer score clearly lower." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 6\n", "\n", "Get the train and test error for the best NN you obtained with 10000 points. This time you can run for 100 iterations if you cannot run for 300 iterations. \n" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "RESULTS FOR BEST NN\n", "\n", "Best NN training error: 0.000000\n", "Best NN test error: 0.212403\n" ] } ], "source": [ "#get training and test error for the best NN model from CV: 1 hidden layer with 50 neurons\n", "\n", "best_mlp_large = MLPClassifier(max_iter=300, alpha=1e-4, solver='sgd', tol=1e-4, learning_rate_init=.1, random_state=ID, hidden_layer_sizes=(50,))\n", "\n", "best_mlp_large.fit(X_train, y_train)\n", "\n", "training_error_large = 1. - best_mlp_large.score(X_train, y_train)\n", "\n", "test_error_large = 1. 
- best_mlp_large.score(X_test, y_test)\n", "\n", "print ('RESULTS FOR BEST NN\\n')\n", "\n", "# report the errors of the model trained on 10000 points\n", "print (\"Best NN training error: %f\" % training_error_large)\n", "print (\"Best NN test error: %f\" % test_error_large)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 7\n", "\n", "Compare the train and test error you got with a large number of samples with the best one you obtained with only 500 data points. Are the architectures the same or do they differ? What about the errors you get?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The selected architecture is the same in both cases; only the number of samples in the training set differs:\n", "\n", "- 1 hidden layer with 50 neurons for 500 data points.\n", "- 1 hidden layer with 50 neurons for 10000 data points.\n", "\n", "The results obtained with 10000 data points are better than those obtained with 500 data points." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 8\n", "\n", "Plot an image that was misclassified by the NN with m=500 training data points and is now correctly classified by the NN with m=10000 training data points." 
] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INPUT:\n" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAPsAAAD4CAYAAAAq5pAIAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAASBElEQVR4nO3dbWxU55UH8P/hLQFswMbGtoyzhgblXYFqhFZiRbJqAiRfSBWxKh8qqqBSKYlUon7YKPuh+Rittq36YVOJblDpqpuqUokgUrIlIkUJUgIxgRBY8gqmMRhs82rAQMBnP/imMuB7zmTuzNyB8/9Jlu05fmYej/lz7Tn3uY+oKojo1jcu7wkQUXUw7ERBMOxEQTDsREEw7ERBTKjmgzU1NWlnZ2c1H/KWcPHiRbN++fLlku97eHg4031PmTLFrA8ODqbWmpqazLETJ04063Sj7u5uDAwMyFi1TGEXkWUAfg1gPID/UtWXrK/v7OxEV1dXloesSV5gxo3L9gvUZ599Zta7u7tTayJj/tz/7sKFC2b9q6++MusPPvigWd+2bVtq7amnnjLHtre3m3W6UaFQSK2V/K9QRMYD+E8AjwG4F8BKEbm31PsjosrKcshZCOALVT2oqpcB/BHA8vJMi4jKLUvY2wGM/h2vJ7ntGiKyRkS6RKSrv78/w8MRURZZwj7WH4M3nHurqutUtaCqhebm5gwPR0RZZAl7D4COUZ/PBnA023SIqFKyhP0DAPNEZI6ITALwAwCbyzMtIiq3kltvqnpFRJ4F8BeMtN7Wq+r+ss3sJlLplYNr164164cOHSr5vidMsP8JeL1uq48OAJcuXUqtvf322+bYjRs3mvWGhgazfvXq1dTa+PHjzbG3okx9dlV9A8AbZZoLEVUQT5clCoJhJwqCYScKgmEnCoJhJwqCYScKoqrr2Wls3jLT8+fPm/VZs2al1vbu3WuOHRoaMuveElmrjw7Y5yDMnDnTHHvq1Cmz7vXZeeXka/HIThQEw04UBMNOFATDThQEw04UBMNOFARbb2Vw5coVs+4tp/Raa94y0gceeCC15l26+7bbbjPrb775plm/7777zLp1qelp06aZYwcGBsz63LlzzTpbb9fikZ0oCIadKAiGnSgIhp0oCIadKAiGnSgIhp0oCPbZy8DrVXu8bY/r6+vN+rFjx1Jr3i6sK1asMOtz5swx60uXLjXrW7duTa1Zu88CwHvvvWfWFy5caNa55fO1eGQnCoJhJwqCYScKgmEnCoJhJwqCYScKgmEnCoJ99oS39tm6pPLly5fNsa+//rpZ37Rpk1kfHh426+3t7am15uZmc+zu3bvN+rhx9vFg586dZn3y5MmpNW9uu3btMusvv/yyWV+8eHFq7f777zfH3ooyhV1EugEMArgK4IqqFsoxKSIqv3Ic2f9ZVe1LihBR7vg3O1EQWcOuALaIyC4RWTPWF4jIGhHpEpGu/v7+jA9HRKXKGvZFqvpdAI8BeEZEbnhFRFXXqWpBVQveCzJEVDmZwq6qR5P3fQBeA2AvQyKi3JQcdhGZKiL133wMYAmAfeWaGBGVV5ZX41sAvJb0nycA+B9V/d+yzCoH3tbE1rbKTz75pDn2jjvuMOveevhJkyaZdavPP2GC/SP2vu/GxkazfubMGbPe2tqaWvPOT/C+708++cSsb9myJbX20EMPmWOfe+45s57lvIy8lBx2VT0I4MEyzoWIKoitN6IgGHaiIBh2oiAYdqIgGHaiILjEtUjWMlWrvQT4WzZ7Ojo6zLq1DPXixYvmWK/uXY65rq7OrFvLcxsaGsyx3lbYXntrwYIFqbX
e3l5zbNa2YC3ikZ0oCIadKAiGnSgIhp0oCIadKAiGnSgIhp0oCPbZi2Rdcvn22283x546dcqst7W1mXXv/q2esLcU0zsHYGhoyKw3NTWVfP/eZaq9Hr53DoD1vHuP/f7775t16zLVtYpHdqIgGHaiIBh2oiAYdqIgGHaiIBh2oiAYdqIg2Gcv0uHDh1NrLS0t5tisa8pnzJhh1r/++uvUmrfm21uL76379vrVly5dSq15z5u13TNgf98AcPXq1dSad37BgQMHzDr77ERUsxh2oiAYdqIgGHaiIBh2oiAYdqIgGHaiINhnT3z00Udm/fTp06m1efPmmWO9Pnp/f79Zb25uNuvWum9r3oDfw7/zzjvNunf/1nbU9fX15ljv2u1Hjhwx69Z21d41Ao4fP27Wb0bukV1E1otIn4jsG3Vbo4i8JSKfJ+/tq/0TUe6K+TX+dwCWXXfb8wC2quo8AFuTz4mohrlhV9V3AJy87ublADYkH28A8ER5p0VE5VbqC3QtqtoLAMn7WWlfKCJrRKRLRLq8v02JqHIq/mq8qq5T1YKqFrwXmoiockoN+3ERaQOA5H1f+aZERJVQatg3A1iVfLwKwKbyTIeIKsXts4vIqwAeBtAkIj0Afg7gJQB/EpHVAP4GYEUlJ1kNXl/V6kd7+4h7vWiv55uFt97cOwdg5syZZt3r0x89ejS15l333eqTA/ZaecD+uXh7wx87dsys34zcsKvqypTS98o8FyKqIJ4uSxQEw04UBMNOFATDThQEw04UBJe4JqzLDgN+m8hy5swZs97R0WHWz507Z9atuc2ZM8cc633fAwMDZt27FHVPT09qbXBw0Bzb2Nho1r22orVls9dS9C5T7bVbvbZhHnhkJwqCYScKgmEnCoJhJwqCYScKgmEnCoJhJwqi9pqBOfEumWX1Tb2e7D333GPWh4aGMtVV1axbvF61t+WzZ/78+ak1r1ft9eG9Kx9Z5ydk2e4ZsHv4gD+3PPDIThQEw04UBMNOFATDThQEw04UBMNOFATDThQE++yJEydOmHVr6+Hh4WFzbKFQMOvbtm0z697lnq317N62x966a68X7s3Nun+vl+2t48/Sy/b65N75BSdPXr/94bXYZyei3DDsREEw7ERBMOxEQTDsREEw7ERBMOxEQbDPnjh//rxZt3rZXp/d29Y4S68asHvpdXV1me7bWyvv1a3799bSe+vZva2u29raUmuffvqpOda73r53/YO77rrLrOfBPbKLyHoR6RORfaNue1FEjojInuTt8cpOk4iyKubX+N8BWDbG7b9S1fnJ2xvlnRYRlZsbdlV9B4B9biAR1bwsL9A9KyJ7k1/zG9K+SETWiEiXiHR5f+cQUeWUGvbfAPgOgPkAegH8Iu0LVXWdqhZUtVCLiwOIoigp7Kp6XFWvquowgN8CWFjeaRFRuZUUdhEZ3dP4PoB9aV9LRLXB7bOLyKsAHgbQJCI9AH4O4GERmQ9AAXQD+Enlplgd3rptqyfsrcuur6836+PHj89Unzx5cmrNuz66t++8t67bG289r16ffOrUqWbde96nT59e0ryKkXV8Htywq+rKMW5+pQJzIaIK4umyREEw7ERBMOxEQTDsREEw7ERBcIlrwmvjWK0373LNniztK8BeCjpt2jRzrPd9e0tYvWWq1nOTdbtob1mytcTV+7685yXrzzwPPLITBcGwEwXBsBMFwbATBcGwEwXBsBMFwbATBcE+e8JbCmotM7W2c/bGAsCFCxfMunf/Vj/aG+v1uj1eL9zqZ3tbMk+aNMmsnz171qxb35s3b+9nNjQ0ZNZrEY/sREEw7ERBMOxEQTDsREEw7ERBMOxEQTDsREGwz57w1i9bdetSzoC/9bC3nj3LtsonTpwwx3qXa/Z42017/ewsY71zI/btS9/OwLuMtYd9diKqWQw7URAMO1EQDDtREAw7URAMO1EQDDtREOyzJ7ye7vDwcGrNW3c9ZcoUs+6tObceG7B7vt59e7z
zD7xzBKw+vDfWW2vv9cqt5917bM/NuGWze2QXkQ4R+auIHBCR/SLy0+T2RhF5S0Q+T943VH66RFSqYn6NvwLgZ6p6D4B/BPCMiNwL4HkAW1V1HoCtyedEVKPcsKtqr6p+mHw8COAAgHYAywFsSL5sA4AnKjRHIiqDb/UCnYh0AlgAYAeAFlXtBUb+QwAwK2XMGhHpEpGu/v7+jNMlolIVHXYRqQPwZwBrVdW+0t8oqrpOVQuqWmhubi5ljkRUBkWFXUQmYiTof1DVjcnNx0WkLam3AeirzBSJqBzc1puM9KReAXBAVX85qrQZwCoALyXvN1VkhlXitVKs9pfXIpo+fbpZ97b/9ZZTWuOztpi8ZaTe3Kzlv97S3yw/EwBobW1Nre3evdscm+US2bWqmD77IgA/BPCxiOxJbnsBIyH/k4isBvA3ACsqMkMiKgs37Kq6HUDaf3PfK+90iKhSeLosURAMO1EQDDtREAw7URAMO1EQXOKa8JZyWrxe9s6dO8363LlzzXpHR4dZt/r8AwMD5lhv+a3XT85yqWjv+/bu+/Tp02bd6uN7WzJ7Pf5bcokrEd0aGHaiIBh2oiAYdqIgGHaiIBh2oiAYdqIg2GdPeGvKrb5qY2OjOXb//v1m3dpaGAAOHjxo1q1+cltbmzn27rvvNuvelsx1dXVm/fDhw6m17du3m2NbWlrMuve8rV69OrXmbVV96dIls+6t869FPLITBcGwEwXBsBMFwbATBcGwEwXBsBMFwbATBcE+e8Lb/tdaW+1dF95z4cIFs+6tKbfGz5492xzrXXvd6zd76+GtdePeenSvl11fX2/WrfMfvLHe9fC977sW8chOFATDThQEw04UBMNOFATDThQEw04UBMNOFEQx+7N3APg9gFYAwwDWqeqvReRFAD8G0J986Quq+kalJlppX375pVm31mV3dnaaY999912z7l2z3us3W71wbw/08+fPm/VDhw6Zda+Pf/LkydTa2bNnzbHeNQbOnDlj1q259/T0mGP37Nlj1hcvXmzWa1ExJ9VcAfAzVf1QROoB7BKRt5Lar1T1Pyo3PSIql2L2Z+8F0Jt8PCgiBwC0V3piRFRe3+pvdhHpBLAAwI7kpmdFZK+IrBeRhpQxa0SkS0S6+vv7x/oSIqqCosMuInUA/gxgraqeBfAbAN8BMB8jR/5fjDVOVdepakFVC83NzdlnTEQlKSrsIjIRI0H/g6puBABVPa6qV1V1GMBvASys3DSJKCs37DKy3OsVAAdU9Zejbh992dLvA7Av9UlEuSrm1fhFAH4I4GMR2ZPc9gKAlSIyH4AC6AbwkwrMr2q85ZZWq8ZqLwHAjh07zLrX5vEu12y1z6ZNm2aOnTFjhll/+umnzbrXFjx16lRqzXvOGxrGfBno77y24ZIlS1Jrjz76qDm2r6/PrB85csSs16JiXo3fDmCsxdw3bU+dKCKeQUcUBMNOFATDThQEw04UBMNOFATDThQELyWdeOSRR8y6dcnlBQsWmGNbW1vN+rJly8w6ld+iRYvMureV9dKlS8s5nargkZ0oCIadKAiGnSgIhp0oCIadKAiGnSgIhp0oCPG2Ay7rg4n0Axh9TeYmAANVm8C3U6tzq9V5AZxbqco5t39Q1TGv/1bVsN/w4CJdqlrIbQKGWp1brc4L4NxKVa258dd4oiAYdqIg8g77upwf31Krc6vVeQGcW6mqMrdc/2YnourJ+8hORFXCsBMFkUvYRWSZiHwqIl+IyPN5zCGNiHSLyMciskdEunKey3oR6RORfaNuaxSRt0Tk8+S9fXH16s7tRRE5kjx3e0Tk8Zzm1iEifxWRAyKyX0R+mtye63NnzKsqz1vV/2YXkfEAPgPwKIAeAB8AWKmq/1fViaQQkW4ABVXN/QQMEVkM4ByA36vq/clt/w7gpKq+lPxH2aCq/1ojc3sRwLm8t/FOditqG73NOIAnAPwIOT53xrz+BVV43vI4si8E8IWqHlTVywD+CGB5DvO
oear6DoDrt5tZDmBD8vEGjPxjqbqUudUEVe1V1Q+TjwcBfLPNeK7PnTGvqsgj7O0Avhr1eQ9qa793BbBFRHaJyJq8JzOGFlXtBUb+8QCYlfN8rudu411N120zXjPPXSnbn2eVR9jH2kqqlvp/i1T1uwAeA/BM8usqFaeobbyrZYxtxmtCqdufZ5VH2HsAdIz6fDaAoznMY0yqejR53wfgNdTeVtTHv9lBN3lv70BYRbW0jfdY24yjBp67PLc/zyPsHwCYJyJzRGQSgB8A2JzDPG4gIlOTF04gIlMBLEHtbUW9GcCq5ONVADblOJdr1Mo23mnbjCPn5y737c9VtepvAB7HyCvyXwL4tzzmkDKvuQA+St725z03AK9i5Ne6rzHyG9FqADMBbAXwefK+sYbm9t8APgawFyPBastpbv+EkT8N9wLYk7w9nvdzZ8yrKs8bT5clCoJn0BEFwbATBcGwEwXBsBMFwbATBcGwEwXBsBMF8f95dSGGNi/b5QAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "LABEL: 4\n", "For 500 samples: 2\n", "For 10000 samples: 4 \n" ] } ], "source": [ "# compute the predictions of both networks once, outside the loop\n", "yp = mlp.predict(X_test)\n", "ypl = best_mlp_large.predict(X_test)\n", "\n", "# find the first test image misclassified with m=500 but correct with m=10000\n", "for i in range(y_test.size):\n", "    if yp[i] != y_test[i] and ypl[i] == y_test[i]:\n", "        plot_input(X_test,y_test,i)\n", "        print('For 500 samples: %s' % yp[i])\n", "        print('For 10000 samples: %s ' % ypl[i])\n", "        break\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's plot some of the weights of the multi-layer perceptron classifier, for the best NN we get with 500 data points and with 10000 data points. The code below plots the weights in matrix form, where each figure represents all the weights of the edges entering a hidden node. Notice that the code assumes that the NNs are called \"mlp\" and \"best_mlp_large\": you may need to replace such variables with your variable names. \n", "\n" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Weights with 500 data points:\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Weights with 10000 data points:\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "print(\"Weights with 500 data points:\")\n", "\n", "fig, axes = plt.subplots(4, 4)\n", "vmin, vmax = mlp.coefs_[0].min(), mlp.coefs_[0].max()\n", "for coef, ax in zip(mlp.coefs_[0].T, axes.ravel()):\n", "    ax.matshow(coef.reshape(28, 28), cmap=plt.cm.gray, vmin=.5 * vmin, vmax=.5 * vmax)\n", "    ax.set_xticks(())\n", "    ax.set_yticks(())\n", "\n", "plt.show()\n", "\n", "print(\"Weights with 10000 data points:\")\n", "\n", "fig, axes = plt.subplots(4, 4)\n", "vmin, vmax = best_mlp_large.coefs_[0].min(), best_mlp_large.coefs_[0].max()\n", "for coef, ax in zip(best_mlp_large.coefs_[0].T, axes.ravel()):\n", "    ax.matshow(coef.reshape(28, 28), cmap=plt.cm.gray, vmin=.5 * vmin, vmax=.5 * vmax)\n", "    ax.set_xticks(())\n", "    ax.set_yticks(())\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 9\n", "\n", "Describe what you observe by looking at the weights.\n", "\n", "The weight images of the network trained on 10000 data points are clearer and less noisy than those of the network trained on 500 data points. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 10\n", "\n", "Pick another classifier among the ones we have seen previously (SVM or something else). Report the training and test error for such classifier with 10000 samples in the training set, if possible; if the classifier cannot run with so many data samples, reduce the number of samples.\n", "\n", "*Note*: if there are parameters to be optimized use cross-validation. 
If you choose SVM, you can decide if you want to use a single kernel or use the best among many; in the latter case, you need to pick the best kernel using cross-validation (using the functions available in sklearn).\n", "\n", "CLASSIFIER CHOSEN: SVM with rbf kernel" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "RESULTS FOR OTHER CLASSIFIER\n", "\n", "Best training error (other model): 0.016700\n", "Best test error (other model): 0.122260\n" ] } ], "source": [ "from sklearn.svm import SVC\n", "from sklearn.model_selection import GridSearchCV\n", "\n", "# SVM with rbf kernel; the regularization parameter C is chosen by cross-validation\n", "SVM = SVC(kernel='rbf')\n", "parameters = {'C': [1, 10, 100]}\n", "g = GridSearchCV(SVM,parameters,scoring='accuracy')\n", "g.fit(X_train,y_train)\n", "\n", "# g.score uses the best estimator, refit on the whole training set\n", "training_error_other = 1. - g.score(X_train,y_train)\n", "\n", "test_error_other = 1. - g.score(X_test,y_test)\n", "\n", "print ('RESULTS FOR OTHER CLASSIFIER\\n')\n", "\n", "print (\"Best training error (other model): %f\" % training_error_other)\n", "print (\"Best test error (other model): %f\" % test_error_other)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 11\n", "Compare the results of NN and of the other classifier you have chosen above. Which classifier would you prefer? Provide a brief explanation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With 10000 training samples the SVM with rbf kernel reaches a test error of 0.1223. The best neural network (1 hidden layer with 50 neurons) obtained a cross-validation accuracy of 0.8472 on the same training set, i.e. an error of about 0.15. On these numbers the SVM generalizes better, so I would prefer it for this task." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Clustering with K-means\n", "\n", "Clustering is a useful technique for *unsupervised* learning. We are now going to cluster 2000 images in the fashion MNIST dataset, and try to understand if the clusters we obtain correspond to the true labels." 
] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "#load the required packages\n", "\n", "from sklearn import metrics\n", "from sklearn.cluster import KMeans" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(Note that the code below assumes that the data has already been transformed as in the NN part of the notebook, so make sure to run the code for the transformation even if you do not complete the part on NN.)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "#let's consider only 2000 data points\n", "\n", "X = X[permutation]\n", "y = y[permutation]\n", "\n", "m_training = 2000\n", "\n", "X_train, X_test = X[:m_training], X[m_training:]\n", "y_train, y_test = y[:m_training], y[m_training:]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 1\n", "Cluster the points using the KMeans() and fit() functions (see the userguide for details). For Kmeans, set: n_clusters=10 as number of clusters; n_init=10 as the number of times the algorithm will be run with different centroid seeds; random_state = ID. You can use the default setting for the other parameters." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "KMeans(n_clusters=10, random_state=2041267)" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "kmeans = KMeans(n_clusters=10, n_init=10, random_state = ID)\n", "kmeans.fit(X_train,y_train)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Comparison of clusters with true labels" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 2\n", "Now compare the obtained clusters with the true labels, using the function sklearn.metrics.cluster.contingency_matrix() (see the userguide for details). 
The function returns a matrix $A$ such that entry $A_{i,j}$ is the number of samples in true class $i$ and in predicted class $j$." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 0 1768 3158 52 24 0 486 147 3 182]\n", " [ 0 341 218 5016 3 0 117 51 0 48]\n", " [ 0 136 113 3 24 0 695 3068 1 1777]\n", " [ 0 1835 1626 1962 4 0 242 31 0 95]\n", " [ 0 297 904 97 15 0 291 3000 0 1205]\n", " [ 452 22 2 0 6 205 3603 0 1497 8]\n", " [ 0 747 976 15 59 1 917 1618 4 1458]\n", " [ 744 0 0 0 3 21 449 0 4571 0]\n", " [ 49 418 24 3 2571 4 435 271 224 1800]\n", " [3043 23 3 0 4 2365 176 0 161 13]]\n" ] } ], "source": [ "# compute and print the contingency matrix for the true labels vs the clustering assignments\n", "# rows are true classes, columns are the clusters predicted on the test set\n", "y_predict = kmeans.predict(X_test)\n", "mat = sklearn.metrics.cluster.contingency_matrix(y_test, y_predict)\n", "print(mat)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 3\n", "Based on the matrix shown above, comment on the results of clustering in terms of adherence to the true labels.\n", "\n", "[ADD YOUR ANSWER HERE]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Choice of k with silhouette coefficient\n", "In many real applications it is unclear what the correct value of $k$ is. In practice one tries different values of $k$ and then uses some external score to choose among them. One such score is the silhouette coefficient, which can be computed with metrics.silhouette_score(). See the definition of the silhouette coefficient in the user guide." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 4\n", "Compute the clustering for k=2,3,...,15 (other parameters as above) and print the silhouette coefficient for each such clustering."
] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Silhoutte coefficient for number of clusters=2: 0.19317571670652278\n", "Silhoutte coefficient for number of clusters=3: 0.18433380720956222\n", "Silhoutte coefficient for number of clusters=4: 0.18042672896786996\n", "Silhoutte coefficient for number of clusters=5: 0.1603201984337485\n", "Silhoutte coefficient for number of clusters=6: 0.15622936969582776\n", "Silhoutte coefficient for number of clusters=7: 0.1667462399786566\n", "Silhoutte coefficient for number of clusters=8: 0.16034511984466143\n", "Silhoutte coefficient for number of clusters=9: 0.1536777083218814\n", "Silhoutte coefficient for number of clusters=10: 0.13566561580397163\n", "Silhoutte coefficient for number of clusters=11: 0.1277787383740401\n", "Silhoutte coefficient for number of clusters=12: 0.1349591125810742\n", "Silhoutte coefficient for number of clusters=13: 0.1335592010143823\n", "Silhoutte coefficient for number of clusters=14: 0.13101268489187756\n", "Silhoutte coefficient for number of clusters=15: 0.12965821137618527\n" ] } ], "source": [ "# run k-means with 10 choices of initial centroids for a range of values of n_clusters\n", "\n", "for i in range(2, 16):\n", "    kmeans = KMeans(n_clusters=i, n_init=10, random_state=ID)\n", "    kmeans.fit(X_train)  # labels are not needed for clustering\n", "    # silhouette coefficient of the cluster assignments on the test points\n", "    silhouette = metrics.silhouette_score(X_test, kmeans.predict(X_test))\n", "    print(\"Silhoutte coefficient for number of clusters=\"+str(i)+\": \"+str(silhouette))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TO DO 5\n", "\n", "Based on the silhouette score, which $k$ would you pick? Justify your choice. Does your choice match what you know about the data? If yes, explain why you think this is the case; if no, explain what you think may be the reason."
] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 4 }