{ "cells": [ { "cell_type": "markdown", "id": "3fc274aa-b43d-45e1-87c0-340408efdc9b", "metadata": {}, "source": [ "# Classifying data with Neural Networks\n", "\n", "In this notebook we will classify data with a neural network. We will use the famous IRIS dataset and train our network to predict the species of an iris flower based on four features of that flower.\n", "\n", "## The IRIS dataset\n", "\n", "The IRIS dataset is a well-known and frequently used dataset in the field of machine learning and statistics. It is often used as a benchmark for classification tasks. The dataset is named after the iris flower, as it contains measurements of various attributes of three different species of iris flowers.\n", "\n", "The IRIS dataset consists of ***150 samples***, with each sample representing an individual iris flower. Each flower sample is described by ***four features*** or attributes:\n", "\n", "1. **Sepal length**: It represents the length of the sepal, a leaf-like part of the outermost whorl of the flower. It is measured in centimeters.\n", "2. **Sepal width**: It denotes the width of the sepal, measured in centimeters.\n", "3. **Petal length**: It represents the length of the petal, which belongs to the whorl just inside the sepals. It is measured in centimeters.\n", "4. **Petal width**: It denotes the width of the petal, measured in centimeters.\n", "\n", "Based on these four features, the task is to classify each iris flower into one of ***three species***:\n", "\n", "1. **Setosa**: Iris setosa is one of the species of iris flowers. It is known for its distinctive appearance, with relatively small sepal and petal sizes.\n", "2. **Versicolor**: Iris versicolor is another species in the iris family. It has intermediate sepal and petal sizes compared to the other two species.\n", "3. **Virginica**: Iris virginica is the third species in the dataset. 
It typically has the largest sepal and petal sizes among the three species.\n", "\n", "The IRIS dataset is widely used for tasks such as classification, clustering, and data visualization. Its simplicity, small size, and well-defined class labels make it an ideal starting point for exploring and evaluating various machine learning algorithms and techniques." ] }, { "cell_type": "code", "execution_count": 1, "id": "cf289899-758c-4df3-ab5f-20b4692a1aaa", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "\n", "import torch\n", "import torch.nn as nn\n", "import torch.optim as optim\n", "from torch.utils.data import Dataset\n", "from torch.utils.data import DataLoader\n", "\n", "from sklearn.datasets import load_iris\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.preprocessing import StandardScaler" ] }, { "cell_type": "code", "execution_count": 2, "id": "688bc589-8167-45e5-8d42-2812ff4d01e5", "metadata": {}, "outputs": [], "source": [ "# First we will load the iris dataset. This dataset contains measurements of iris flowers,\n", "# namely the sepal length, the sepal width, the petal length and the petal width\n", "iris = load_iris(as_frame=True)\n", "X = iris['data']\n", "y = iris['target']" ] }, { "cell_type": "code", "execution_count": 3, "id": "b3da6a88-95af-43dc-9965-fd8a9692a304", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
 " <thead>\n",
 " <tr><th></th><th>sepal length (cm)</th><th>sepal width (cm)</th><th>petal length (cm)</th><th>petal width (cm)</th></tr>\n",
 " </thead>\n",
 " <tbody>\n",
 " <tr><th>0</th><td>5.1</td><td>3.5</td><td>1.4</td><td>0.2</td></tr>\n",
 " <tr><th>1</th><td>4.9</td><td>3.0</td><td>1.4</td><td>0.2</td></tr>\n",
 " <tr><th>2</th><td>4.7</td><td>3.2</td><td>1.3</td><td>0.2</td></tr>\n",
 " <tr><th>3</th><td>4.6</td><td>3.1</td><td>1.5</td><td>0.2</td></tr>\n",
 " <tr><th>4</th><td>5.0</td><td>3.6</td><td>1.4</td><td>0.2</td></tr>\n",
 " </tbody>\n",
 "</table>
" ], "text/plain": [ " sepal length (cm) sepal width (cm) petal length (cm) petal width (cm)\n", "0 5.1 3.5 1.4 0.2\n", "1 4.9 3.0 1.4 0.2\n", "2 4.7 3.2 1.3 0.2\n", "3 4.6 3.1 1.5 0.2\n", "4 5.0 3.6 1.4 0.2" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let's have a look at the four features of the dataset\n", "X.head()" ] }, { "cell_type": "code", "execution_count": 4, "id": "a4c1e4e0-4113-4b6f-a6ff-422c4c940ba4", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 0\n", "1 0\n", "2 0\n", "3 0\n", "4 0\n", "Name: target, dtype: int32" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# And now let's have a look at the labels\n", "y.head()" ] }, { "cell_type": "code", "execution_count": 5, "id": "2778f942-80d9-4d9f-899b-cce4aeca54a6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(array([0, 1, 2]), array([50, 50, 50], dtype=int64))" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let's have a look at how many samples of each species are in the dataset\n", "np.unique(y, return_counts=True)" ] }, { "cell_type": "markdown", "id": "4ebdc70b-faae-4502-b896-39313a919f52", "metadata": {}, "source": [ "So there are 50 samples of each species in the dataset. The species are encoded by the numbers 0, 1 and 2. Now let's prepare the features and then build a dataset class for the IRIS dataset." ] }, { "cell_type": "markdown", "id": "9859a946-244b-4158-a747-d0feb61193f0", "metadata": {}, "source": [ "## Feature Engineering\n", "For training neural networks, as well as many other machine learning algorithms, it is important to scale the features of our dataset. That means either rescaling the values to a range between 0 and 1 (MinMaxScaler) or standardizing each feature to zero mean and unit variance (StandardScaler).\n", "\n", "The scikit-learn library offers many scaling techniques already implemented in the `sklearn.preprocessing` package. 
You can have a look at them here: https://scikit-learn.org/stable/modules/classes.html#module-sklearn.preprocessing\n", "\n", "**StandardScaler**: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html#sklearn.preprocessing.StandardScaler\n", "**MinMaxScaler**: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html#sklearn.preprocessing.MinMaxScaler\n", "\n", "Your task is to apply standard scaling to all four features of the IRIS dataset." ] }, { "cell_type": "code", "execution_count": 6, "id": "b424f9ff-1650-4694-9c63-0e548886587a", "metadata": {}, "outputs": [], "source": [ "# TODO Let's create an instance of StandardScaler\n", "scaler = _\n", "# TODO Apply standard scaling to our features\n", "X_scaled = _" ] }, { "cell_type": "code", "execution_count": 7, "id": "f74c3dec-e568-478c-8acd-8b6b857a93f7", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
 " <thead>\n",
 " <tr><th></th><th>0</th><th>1</th><th>2</th><th>3</th></tr>\n",
 " </thead>\n",
 " <tbody>\n",
 " <tr><th>0</th><td>-0.900681</td><td>1.019004</td><td>-1.340227</td><td>-1.315444</td></tr>\n",
 " <tr><th>1</th><td>-1.143017</td><td>-0.131979</td><td>-1.340227</td><td>-1.315444</td></tr>\n",
 " <tr><th>2</th><td>-1.385353</td><td>0.328414</td><td>-1.397064</td><td>-1.315444</td></tr>\n",
 " <tr><th>3</th><td>-1.506521</td><td>0.098217</td><td>-1.283389</td><td>-1.315444</td></tr>\n",
 " <tr><th>4</th><td>-1.021849</td><td>1.249201</td><td>-1.340227</td><td>-1.315444</td></tr>\n",
 " </tbody>\n",
 "</table>
" ], "text/plain": [ " 0 1 2 3\n", "0 -0.900681 1.019004 -1.340227 -1.315444\n", "1 -1.143017 -0.131979 -1.340227 -1.315444\n", "2 -1.385353 0.328414 -1.397064 -1.315444\n", "3 -1.506521 0.098217 -1.283389 -1.315444\n", "4 -1.021849 1.249201 -1.340227 -1.315444" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Now let's again have a look at our dataset after it has been scaled\n", "pd.DataFrame(X_scaled).head()" ] }, { "cell_type": "code", "execution_count": 8, "id": "4273153e-0f3e-4b3e-8b1c-e1405ce7a0e8", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Text(0.5, 1.0, 'After scaling')" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig, ax = plt.subplots(1, 2)\n", "ax[0].hist(X.iloc[:, 0])\n", "ax[0].set_title(\"Before scaling\")\n", "ax[1].hist(X_scaled[:, 0])\n", "ax[1].set_title(\"After scaling\")" ] }, { "cell_type": "markdown", "id": "af926567-41a6-44ad-b319-d93a97eba10d", "metadata": {}, "source": [ "The features have been standardized: each feature has been shifted by subtracting its mean and then divided by its own standard deviation. We can see this effect by comparing the mean and the variance of a feature before and after applying the standard scaler." ] }, { "cell_type": "code", "execution_count": 9, "id": "9a6b0ca6-cc37-46f1-b72a-47a31e6d36c6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mean before scaling: 5.843333333333334 , Variance before scaling: 0.6856935123042507\n", "Mean after scaling: -4.736951571734001e-16 , Variance after scaling: 1.0\n" ] } ], "source": [ "print(\"Mean before scaling:\", X.iloc[:, 0].mean(), \", Variance before scaling: \", X.iloc[:, 0].var())\n", "print(\"Mean after scaling:\", X_scaled[:, 0].mean(), \", Variance after scaling: \", X_scaled[:, 0].var())" ] }, { "cell_type": "markdown", "id": "2e78307a-7de2-4dca-a4bb-ff71ad05efbb", "metadata": {}, "source": [ "## Building a dataset class and a data loader\n", "\n", "Now that we know how to load and prepare the IRIS dataset for training, it is time to build a dataset class." 
] }, { "cell_type": "code", "execution_count": 10, "id": "e394289d-bbf1-4a51-83b0-d2272243bb72", "metadata": {}, "outputs": [], "source": [ "class IRISDataset(Dataset):\n",
 "    \"\"\"\n",
 "    This class loads the IRIS data\n",
 "    \"\"\"\n",
 "    def __init__(self, is_train_dataset=True):\n",
 "        \"\"\"\n",
 "        Initialize the dataset class\n",
 "        :param is_train_dataset: True if this class should use the training data, False otherwise\n",
 "        \"\"\"\n",
 "        # Load the IRIS dataset\n",
 "        iris = load_iris(as_frame=False)\n",
 "\n",
 "        # TODO Extract features and labels from the dataset\n",
 "        X = _\n",
 "        y = _\n",
 "\n",
 "        # TODO Apply standard scaling to the features. This makes model training more stable.\n",
 "        scaler = _\n",
 "        X_scaled = _\n",
 "\n",
 "        # TODO Split the data set into training and testing\n",
 "        X_train, X_test, y_train, y_test = _(X_scaled, y, test_size=0.2, random_state=2)\n",
 "\n",
 "        # Check whether the training or test data should be loaded\n",
 "        if is_train_dataset:\n",
 "            self.data = X_train\n",
 "            self.labels = y_train\n",
 "        else:\n",
 "            self.data = X_test\n",
 "            self.labels = y_test\n",
 "\n",
 "    def __len__(self):\n",
 "        \"\"\"\n",
 "        This function returns the total number of items in the dataset.\n",
 "        We are using a numpy array in this dataset which has an attribute named shape.\n",
 "        The first dimension of shape is equal to the number of items in the dataset.\n",
 "        :return: The number of items in the dataset\n",
 "        \"\"\"\n",
 "        # TODO return the size of the dataset\n",
 "        return _\n",
 "\n",
 "    def __getitem__(self, idx):\n",
 "        \"\"\"\n",
 "        This function returns a single tuple from the dataset.\n",
 "        :param idx: The index of the tuple that should be returned.\n",
 "        :return: Tuple of a feature vector and a y-value\n",
 "        \"\"\"\n",
 "        # TODO return a tuple of data points and labels\n",
 "        return _, _" ] }, { "cell_type": "code", "execution_count": 11, "id": "13c29b96-c04e-464f-b396-0c160117a464", "metadata": {}, "outputs": [], "source": [ "# TODO Create two datasets: one 
for training and another one for testing\n", "dataset_train = _\n", "dataset_test = _" ] }, { "cell_type": "code", "execution_count": 12, "id": "18bb5256-2cd7-48e3-b75a-d4aa3b99602d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Train dataset size: 120 , Test dataset size 30\n" ] } ], "source": [ "# Let's check how many items are in the training and the test dataset\n", "print(\"Train dataset size:\", len(dataset_train), \", Test dataset size\", len(dataset_test))" ] }, { "cell_type": "code", "execution_count": 13, "id": "7dec1b3a-e725-4646-9fd4-d84f0aabc41c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(array([ 0.4321654 , -0.59237301, 0.59224599, 0.79067065]), 2)" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let's look at the first item of the training dataset.\n", "# You will get a tuple consisting of the feature vector and the label of the item.\n", "dataset_train[0]" ] }, { "cell_type": "markdown", "id": "6165da74-d663-49be-9f7b-1cf0d14b56ac", "metadata": {}, "source": [ "In PyTorch you typically also define a data loader for each dataset, which is responsible for drawing batches of samples from it (in random order if shuffling is enabled). The **batch size** defines how many samples are drawn per batch. As we have a training and a test dataset, we need to create one data loader for each of them." 
] }, { "cell_type": "code", "execution_count": 14, "id": "fb0a820f-63ce-411d-850c-fc5a56fd84bf", "metadata": {}, "outputs": [], "source": [ "# TODO Create data loaders for the IRIS dataset\n", "dataloader_train = _\n", "dataloader_test = _" ] }, { "cell_type": "code", "execution_count": 15, "id": "1a638a3e-2b8a-4b59-9350-fe226878b59c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[tensor([[ 0.6745, -0.5924, 1.0469, 1.1856],\n", " [ 0.3110, -0.1320, 0.6491, 0.7907],\n", " [-1.2642, 0.7888, -1.0560, -1.3154],\n", " [-1.0218, 0.3284, -1.4539, -1.3154],\n", " [-0.4160, -1.2830, 0.1375, 0.1325],\n", " [ 1.0380, -1.2830, 1.1606, 0.7907],\n", " [-0.9007, 1.7096, -1.0560, -1.0522],\n", " [ 1.6438, 0.3284, 1.2743, 0.7907],\n", " [-1.6277, -1.7434, -1.3971, -1.1838],\n", " [ 1.0380, 0.5586, 1.1038, 1.7121],\n", " [-0.5372, 1.9398, -1.1697, -1.0522],\n", " [ 0.6745, 0.3284, 0.8764, 1.4488],\n", " [-1.2642, -0.1320, -1.3402, -1.4471],\n", " [-1.5065, 0.0982, -1.2834, -1.3154],\n", " [ 0.7957, -0.5924, 0.4786, 0.3958],\n", " [-0.2948, -0.8226, 0.2512, 0.1325],\n", " [-1.3854, 0.3284, -1.2266, -1.3154],\n", " [-0.1737, -0.3622, 0.2512, 0.1325],\n", " [ 0.6745, -0.3622, 0.3081, 0.1325],\n", " [ 1.0380, -0.1320, 0.7059, 0.6590],\n", " [-1.7489, 0.3284, -1.3971, -1.3154],\n", " [ 1.6438, -0.1320, 1.1606, 0.5274],\n", " [-1.2642, -0.1320, -1.3402, -1.1838],\n", " [-0.1737, -1.2830, 0.7059, 1.0539],\n", " [-1.0218, -0.1320, -1.2266, -1.3154],\n", " [ 0.1898, 0.7888, 0.4217, 0.5274],\n", " [-0.6583, 1.4794, -1.2834, -1.3154],\n", " [ 0.5533, -1.7434, 0.3649, 0.1325],\n", " [-1.3854, 0.3284, -1.3971, -1.3154],\n", " [-1.5065, 0.7888, -1.3402, -1.1838]], dtype=torch.float64),\n", " tensor([2, 2, 0, 0, 1, 2, 0, 2, 0, 2, 0, 2, 0, 0, 1, 1, 0, 1, 1, 1, 0, 2, 0, 2,\n", " 0, 1, 0, 1, 0, 0], dtype=torch.int32)]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# We can get a random sample from a data loader by wrapping it into 
next(iter()).\n", "# You will get a tuple that contains a batch of feature vectors and their corresponding labels.\n", "# If the data loader shuffles its dataset, running this code multiple times returns the samples in a different order.\n", "next(iter(dataloader_test))" ] }, { "cell_type": "markdown", "id": "4d1513ed-7c7d-4638-91a4-a5843732641c", "metadata": {}, "source": [ "## Building a neural network for classification\n", "\n", "Great! We now know how to load the IRIS dataset, preprocess it, split it into training and test data and build Dataset and DataLoader classes. The next step is to define a neural network that is able to consume our data and predict the species based on the four features of the data." ] }, { "cell_type": "code", "execution_count": 16, "id": "3f185b58-fce3-4837-8de8-cdef4ff8f728", "metadata": {}, "outputs": [], "source": [ "class IRISClassificationNetwork(nn.Module):\n",
 "    def __init__(self):\n",
 "        \"\"\"\n",
 "        Here we define the layers of our neural network.\n",
 "        \"\"\"\n",
 "        super(IRISClassificationNetwork, self).__init__()\n",
 "        # Our data has four features, so the first linear layer has to have four input dimensions.\n",
 "        # TODO add the linear layer\n",
 "        self.layer1 = _\n",
 "        # The second linear layer needs to have the same input dimension as layer1 has outputs.\n",
 "        # TODO add the linear layer\n",
 "        self.layer2 = _\n",
 "        # We have three different classes in our data, so the last linear layer must have 3 output dimensions.\n",
 "        # TODO add the linear layer\n",
 "        self.layer3 = _\n",
 "        # TODO Add a ReLU layer\n",
 "        self.activation = _\n",
 "        # The outputs of the last linear layer need to be mapped to a probability distribution.\n",
 "        # This can be done by running the vectors through a softmax function.\n",
 "        # TODO add the softmax function\n",
 "        self.classification = _\n",
 "\n",
 "    def forward(self, x):\n",
 "        \"\"\"\n",
 "        The forward function takes a data vector and runs it through the layers of our neural network.\n",
 "        :return: The forward function returns a vector of size 3 which contains the\n",
 "                 probabilities for all three classes for a given data vector.\n",
 "        \"\"\"\n",
 "        # TODO Run the input through the first linear layer and then through the activation function.\n",
 "        x = _\n",
 "        # TODO Run the outputs of layer 1 through layer 2.\n",
 "        x = _\n",
 "        # TODO Run the outputs of layer 2 through the third linear layer and then through the softmax classification function.\n",
 "        x = _\n",
 "        return x" ] }, { "cell_type": "code", "execution_count": 17, "id": "10f4d593-6e16-406c-8408-bb56ec7747f4", "metadata": {}, "outputs": [], "source": [ "# TODO Now that we have defined the network class we need to create an instance of it\n", "net = _" ] }, { "cell_type": "code", "execution_count": 18, "id": "566d562e-811c-470f-9740-8c40874daf3e", "metadata": {}, "outputs": [], "source": [ "def get_accuracy(net, dataloader):\n",
 "    \"\"\"\n",
 "    This function computes the accuracy of the neural network by drawing one batch from a\n",
 "    data loader, running it through the network and computing the fraction of correct predictions.\n",
 "    :param net: The neural network\n",
 "    :param dataloader: A DataLoader instance\n",
 "    :return: The fraction of correct predictions (between 0 and 1) on the drawn batch\n",
 "    \"\"\"\n",
 "    # torch.no_grad means that no gradients should be computed when running data through the network.\n",
 "    # When we run test data through the network this should not have an effect on our training, that is\n",
 "    # why we don't want to compute gradients here.\n",
 "    with torch.no_grad():\n",
 "        X_test, y_test = next(iter(dataloader))\n",
 "        y_pred = net(X_test.to(torch.float32))\n",
 "        correct = (torch.argmax(y_pred, dim=1) == y_test).type(torch.float32)\n",
 "        return correct.mean().item()" ] }, { "cell_type": "code", "execution_count": 19, "id": "7c238b0a-cae6-4c7a-9561-ab93fb7c8137", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy before training: 0.36666667461395264\n" ] } ], "source": [ "# Let's check the accuracy before training the network\n", "print(\"Accuracy before training:\", get_accuracy(net, dataloader_test))" ] }, { "cell_type": "markdown", "id": "a14de1e1-c98e-4b96-81f2-e1ce6b614d51", "metadata": {}, "source": [ "The accuracy of the untrained network is close to random guessing (about 1/3 for three equally sized classes), because the parameters of the network are initialized randomly. Let's train the network to find the parameters that allow us to make better predictions for the classes of our flowers." ] }, { "cell_type": "markdown", "id": "58df85a8-d098-4a7f-a3a9-807e6fba0ee3", "metadata": {}, "source": [ "## Training the network" ] }, { "cell_type": "code", "execution_count": 20, "id": "d7c16a77-d20f-4375-8071-d6dc1ca44658", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "C:\\Users\\tilof\\AppData\\Local\\Programs\\Python\\Python310\\lib\\site-packages\\torch\\autograd\\__init__.py:200: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. 
(Triggered internally at ..\\c10\\cuda\\CUDAFunctions.cpp:109.)\n", " Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch [10/250], Loss: 1.0342, Accuracy on test data: 0.4000000059604645\n", "Epoch [20/250], Loss: 1.0147, Accuracy on test data: 0.800000011920929\n", "Epoch [30/250], Loss: 0.9456, Accuracy on test data: 0.7666666507720947\n", "Epoch [40/250], Loss: 0.8926, Accuracy on test data: 0.800000011920929\n", "Epoch [50/250], Loss: 0.8386, Accuracy on test data: 0.8333333134651184\n", "Epoch [60/250], Loss: 0.8800, Accuracy on test data: 0.8666666746139526\n", "Epoch [70/250], Loss: 0.7504, Accuracy on test data: 0.8666666746139526\n", "Epoch [80/250], Loss: 0.7091, Accuracy on test data: 0.8333333134651184\n", "Epoch [90/250], Loss: 0.6661, Accuracy on test data: 0.8666666746139526\n", "Epoch [100/250], Loss: 0.7385, Accuracy on test data: 0.8666666746139526\n", "Epoch [110/250], Loss: 0.6871, Accuracy on test data: 0.8999999761581421\n", "Epoch [120/250], Loss: 0.7192, Accuracy on test data: 0.8999999761581421\n", "Epoch [130/250], Loss: 0.6934, Accuracy on test data: 0.8999999761581421\n", "Epoch [140/250], Loss: 0.6540, Accuracy on test data: 0.8999999761581421\n", "Epoch [150/250], Loss: 0.6473, Accuracy on test data: 0.8999999761581421\n", "Epoch [160/250], Loss: 0.5812, Accuracy on test data: 0.9333333373069763\n", "Epoch [170/250], Loss: 0.6593, Accuracy on test data: 0.9333333373069763\n", "Epoch [180/250], Loss: 0.6610, Accuracy on test data: 0.9333333373069763\n", "Epoch [190/250], Loss: 0.5954, Accuracy on test data: 0.9333333373069763\n", "Epoch [200/250], Loss: 0.5859, Accuracy on test data: 0.9333333373069763\n", "Epoch [210/250], Loss: 0.6564, Accuracy on test data: 0.9333333373069763\n", "Epoch [220/250], Loss: 0.6279, Accuracy on test data: 0.9333333373069763\n", "Epoch [230/250], Loss: 0.5801, Accuracy on test data: 
0.9333333373069763\n", "Epoch [240/250], Loss: 0.6023, Accuracy on test data: 0.9333333373069763\n", "Epoch [250/250], Loss: 0.6225, Accuracy on test data: 0.9333333373069763\n" ] } ], "source": [ "# Here we define how long we want to train the network\n",
 "num_epochs = 250\n",
 "# TODO This is our loss function. Which one do we need for classification: MSELoss or CrossEntropyLoss?\n",
 "# Note: nn.CrossEntropyLoss applies log-softmax internally, so it is usually fed raw network outputs (logits).\n",
 "criterion = _\n",
 "# TODO This is the algorithm used for optimizing our neural network parameters.\n",
 "optimizer = optim.Adam(_, lr=0.001)\n",
 "\n",
 "for epoch in range(num_epochs):\n",
 "    # TODO Draw a batch of data from the data loader\n",
 "    X, Y = _\n",
 "\n",
 "    # TODO Forward pass\n",
 "    outputs = _\n",
 "\n",
 "    # TODO Compute the loss between the predicted labels and the true labels\n",
 "    loss = _\n",
 "\n",
 "    # TODO First reset the gradients\n",
 "    _\n",
 "\n",
 "    # TODO Then compute the new gradients (backpropagation)\n",
 "    _\n",
 "\n",
 "    # TODO And finally update the network parameters with an optimizer step\n",
 "    _\n",
 "\n",
 "    # Print some metrics about the learning progress\n",
 "    if (epoch + 1) % 10 == 0:\n",
 "        accuracy = get_accuracy(net, dataloader_test)\n",
 "        print(f\"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}, Accuracy on test data:\", accuracy)" ] }, { "cell_type": "code", "execution_count": 21, "id": "109bbc85-19ed-4d6f-8ee2-8bc88c2b7500", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy after training: 0.9333333373069763\n" ] } ], "source": [ "# Let's check the accuracy after training the network\n", "print(\"Accuracy after training:\", get_accuracy(net, dataloader_test))" ] }, { "cell_type": "markdown", "id": "d0f98318-5f5b-4624-859e-89b1d838a23f", "metadata": {}, "source": [ "The accuracy has improved a lot: it has reached about 93% on the test data, which is a good result for such a small network." 
] }, { "cell_type": "markdown", "id": "265a3be6-ec01-4bb1-a77a-27610176cf07", "metadata": {}, "source": [ "## Making predictions\n", "\n", "Now that we have a trained network we can make predictions for data vectors." ] }, { "cell_type": "code", "execution_count": 22, "id": "5e2c019c-88a5-4b92-88a4-e79ad2d9bebf", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "This is our data vector: tensor([-1.5065, 0.7888, -1.3402, -1.1838])\n", "And this is the corresponding label: 0\n" ] } ], "source": [ "# First let's take a data vector from our test dataset.\n", "X, y = dataset_test[0]\n", "# Create torch tensors\n", "X = torch.tensor(X).to(torch.float32)\n", "y = torch.tensor(y)\n", "print(\"This is our data vector:\", X)\n", "print(\"And this is the corresponding label:\", y.item())" ] }, { "cell_type": "code", "execution_count": 23, "id": "1c172493-bd66-44e7-8a4d-b06c889b836f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "This is the prediction of the neural network: [[9.9939942e-01 5.9499586e-04 5.6224708e-06]]\n" ] } ], "source": [ "# Run the data vector through the network\n", "with torch.no_grad():\n", "    y_pred = net(X.reshape(1, -1))\n", "\n", "# Convert the prediction to a NumPy array\n", "y_pred = y_pred.numpy()\n", "\n", "print(\"This is the prediction of the neural network:\", y_pred)" ] }, { "cell_type": "code", "execution_count": 24, "id": "c3a8f761-4182-4dda-ac70-78338e968d3f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Let's plot the predicted class probabilities with a bar chart\n", "possible_classes = [0,1,2]\n", "# We plot the possible classes on the x-axis and the probabilities on the y-axis\n", "plt.bar(possible_classes, y_pred.squeeze())" ] }, { "cell_type": "markdown", "id": "65b51e21-84af-4587-a3f4-bb0500879768", "metadata": {}, "source": [ "As you can see, the network predicts the label 0 with a probability of nearly 100%. The other probabilities are so small that they can't even be seen in the bar chart." ] }, { "cell_type": "code", "execution_count": null, "id": "f89c655d-d522-4193-8af5-eef27e6a32a9", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 5 }