add pytorch0

00487fbd · GILSON Matthieu · c7c48614 · 00487fbd
Commit 00487fbd authored Apr 13, 2024 by GILSON Matthieu
--- a/autodiff/pytorch0_data.ipynb
+++ b/autodiff/pytorch0_data.ipynb
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2136dc53-3080-47a5-9995-ea2a1d57f8ec",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import torch\n",
+    "import numpy as np"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "bd151e70-c3c0-44c2-80ad-9cf8acc85e0c",
+   "metadata": {},
+   "source": [
+    "This tutorial heavily draws on [https://pytorch.org/tutorials](https://pytorch.org/tutorials); refer to those pages for further detail."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b1d5ee6c-e12d-43e4-b489-fb00ae26f4c6",
+   "metadata": {},
+   "source": [
+    "# Data types: everything is a tensor\n",
+    "\n",
+    "Tensors in `torch` are the equivalent of `numpy` arrays. See [https://pytorch.org/docs/stable/torch.html](https://pytorch.org/docs/stable/torch.html) for detail.\n",
+    "\n",
+    "They can be created from Python or NumPy objects. The data type is inferred automatically unless stated otherwise."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5deb6c42-fb46-4006-af03-f9a8ce69c2a2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# list\n",
+    "data = [[0, 2],[5, 4]]\n",
+    "\n",
+    "# tensor from list\n",
+    "t_data = torch.tensor(data)\n",
+    "\n",
+    "print('data:\\n{}\\nt_data:\\n{}'.format(data, t_data))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2c2447ab-0c5a-42be-8429-a77841eed101",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print('t_data shape: {}'.format(t_data.shape))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1c26ebfa-fd79-4b13-90f9-00de802b5b82",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print('data type is {}, made of elements of type {}'.format(type(data), type(data[0][0])))\n",
+    "print('t_data type is {}, made of elements of type {}'.format(type(t_data), t_data.dtype))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cae68bf6-05d9-4f03-9762-54c047383c3d",
+   "metadata": {},
+   "source": [
+    "Types can be overridden, for example to save memory space."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2a56b0b6-3f49-4562-99e7-f9417cf37941",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# tensor from list\n",
+    "t_data = torch.tensor(data, dtype=torch.float32)\n",
+    "\n",
+    "print('t_data type is {}, made of elements of type {}'.format(type(t_data), t_data.dtype))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "690ab4f8-e84a-4b9e-9e5b-fae871c69895",
+   "metadata": {},
+   "source": [
+    "Similarly, tensors can be built from `numpy` arrays."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b2413d4a-170e-43e9-a98d-41c706072834",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "x_data = np.array(data, dtype=np.float32)\n",
+    "\n",
+    "t_data = torch.from_numpy(x_data)\n",
+    "\n",
+    "print('x_data type is {}, made of elements of type {}'.format(type(x_data), x_data.dtype))\n",
+    "print('t_data type is {}, made of elements of type {}'.format(type(t_data), t_data.dtype))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6784233c-262c-463f-bbde-f0ae3b136d5e",
+   "metadata": {},
+   "source": [
+    "Careful here: the `torch` tensor and `numpy` array are linked together (they share the same memory), so changing one changes the other."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ffba4a35-3340-4749-98e4-2613326f444b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# add one to all elements\n",
+    "t_data += 1\n",
+    "print('t_data:\\n{}\\nx_data:\\n{}'.format(t_data, x_data))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e45e939c-932f-4bba-b266-151b0f1d3e15",
+   "metadata": {},
+   "source": [
+    "But that is not the case with the list..."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "48708357-fee8-4634-b315-e70a1c03b532",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(data)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3a419b5f-202a-49f2-ad93-43bca2b256f3",
+   "metadata": {},
+   "source": [
+    "A particularity of `torch` is that each tensor keeps track of the device on which it is stored (usually CPU or GPU)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a10375d2-5e99-4403-9b13-f8a18e09b1f3",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print('t_data is stored on: {}'.format(t_data.device))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b20bdca8-f70b-4d32-a621-ec0308e45f95",
+   "metadata": {},
+   "source": [
+    "Most usual constructors from `numpy` are available. See also `torch.zeros_like`, `torch.arange`, `torch.linspace`, `torch.eye`, etc."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "39fafc79-b8dc-4e54-8b7e-0822f28d362b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "shape = (2,3,)\n",
+    "\n",
+    "# tensor filled with zeros\n",
+    "t_zeros = torch.zeros(shape)\n",
+    "print('t_zeros: \\n {} \\n'.format(t_zeros))\n",
+    "\n",
+    "# tensor filled with ones\n",
+    "t_ones = torch.ones(shape)\n",
+    "print('t_ones: \\n {} \\n'.format(t_ones))\n",
+    "\n",
+    "# tensor filled with random values drawn uniformly from [0, 1)\n",
+    "t_rand = torch.rand(shape)\n",
+    "print('t_rand: \\n {} \\n'.format(t_rand))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "09b88eee-3f0a-4308-993c-17376b039f56",
+   "metadata": {},
+   "source": [
+    "Likewise, many functions from `numpy` are available as member functions, in particular linear algebra from `numpy.linalg`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3fe62515-3a51-431c-9ca4-8036cca8511f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# sum\n",
+    "print('sum of t_rand: \\n{}'.format(t_rand.sum()))\n",
+    "print('or equivalently')\n",
+    "print('{} \\n'.format(torch.sum(t_rand)))\n",
+    "      \n",
+    "# mean\n",
+    "print('mean of t_rand: \\n{}'.format(t_rand.mean(axis=1)))\n",
+    "print('or equivalently')\n",
+    "print('{} \\n'.format(torch.mean(t_rand, axis=1)))\n",
+    "\n",
+    "# std\n",
+    "print('standard deviation of t_rand: \\n{}'.format(t_rand.std(axis=1)))\n",
+    "print('or equivalently')\n",
+    "print('{} \\n'.format(torch.std(t_rand, axis=1)))\n",
+    "\n",
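+    "# svd (torch.svd is deprecated in recent PyTorch releases; torch.linalg.svd is the recommended replacement)\n",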
+    "print('singular value decomposition of t_rand: \\n{}'.format(t_rand.svd()))\n",
+    "print('or equivalently')\n",
+    "print('{} \\n'.format(torch.svd(t_rand)))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d0621d20-40ba-4e2f-97f6-7a5657714101",
+   "metadata": {},
+   "source": [
+    "One has to be careful with the difference between element-wise multiplication of arrays (`torch.mul` or `*`) and matrix multiplication (`torch.matmul` or `@`)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7dbe3f1f-d984-49cc-bef5-15a9461591b7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "t1 = torch.rand((2,2))\n",
+    "t2 = torch.rand((2,3))\n",
+    "t3 = torch.rand((2,3))\n",
+    "\n",
+    "print('tensors t1, t2 and t3:\\n{}\\n{}\\n{}'.format(t1, t2, t3))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "84684473-62d1-4d4e-b017-1ca5b71ea4ee",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print('matrix multiplication\\n{}'.format(t1.matmul(t2)))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c685cec2-d0c9-4442-9043-d66090732b80",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print('element-wise multiplication\\n{}'.format(t2.mul(t3)))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e6672cf6-0ad7-409b-8c57-0bb14fe0e48e",
+   "metadata": {},
+   "source": [
+    "But conditions on shapes apply: both cells below raise a `RuntimeError` because the operand shapes are incompatible."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "73c6f2aa-d8d7-4fe3-a7df-318e827b4bb2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "t2.matmul(t3)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "93acbfd6-037c-4c94-98b0-fc017d758249",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "t1.mul(t2)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4facf6f6-7f63-4b48-b112-6e5021d3705d",
+   "metadata": {},
+   "source": [
+    "As in `numpy`, `torch.einsum` provides a flexible way to multiply and sum along specific axes."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2a40d9d3-1c5a-4782-bf8d-934d75a2c8f8",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print('multiplication along axis 0 for t1 and 0 for t2:\\n{}'.format(torch.einsum('ij, ik -> jk', t1, t2)))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f0addf96-9ebb-4728-9d5f-21cb36c212a8",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print('multiplication along axis 1 for t1 and 0 for t2:\\n{}'.format(torch.einsum('ji, ik -> jk', t1, t2)))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "19d265a2-0a7e-4bb2-b462-af37f42e3ce3",
+   "metadata": {},
+   "source": [
+    "# Datasets and loaders\n",
+    "\n",
+    "Data manipulation is eased in PyTorch by utilities that can load large datasets and draw randomized batches of samples. Many datasets are available, for example image datasets in `torchvision`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "fee7df60-f5ca-4aa6-bcdc-7cf4394c89b8",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from torchvision import datasets\n",
+    "from torchvision.transforms import ToTensor\n",
+    "from torch.utils.data import DataLoader\n",
+    "import matplotlib.pyplot as plt"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e7e1a733-ca91-4eb3-8a42-4a422eaed108",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "training_data = datasets.MNIST(\n",
+    "    root='tmp',\n",
+    "    train=True,\n",
+    "    download=True,\n",
+    "    transform=ToTensor()\n",
+    ")\n",
+    "\n",
+    "test_data = datasets.MNIST(\n",
+    "    root='tmp',\n",
+    "    train=False,\n",
+    "    download=True,\n",
+    "    transform=ToTensor()\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6f6db233-d3db-464a-be45-963b06528564",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# plot random example samples\n",
+    "figure = plt.figure(figsize=(8, 8))\n",
+    "cols, rows = 3, 3\n",
+    "for i in range(1, cols * rows + 1):\n",
+    "    sample_idx = torch.randint(len(training_data), size=(1,)).item()\n",
+    "    img, label = training_data[sample_idx]\n",
+    "    figure.add_subplot(rows, cols, i)\n",
+    "    plt.title(label)\n",
+    "    plt.axis('off')\n",
+    "    plt.imshow(img.squeeze(), cmap='gray')\n",
+    "plt.show()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "91bfa9a2-e189-4d01-b77f-456b0d1d4e9a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# make a batch loader\n",
+    "train_dataloader = DataLoader(training_data, batch_size=64, shuffle=True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e38b72ee-425a-4e80-b681-74d62ff5a48c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# load new batch\n",
+    "train_features, train_labels = next(iter(train_dataloader))\n",
+    "print('Feature batch shape: {}'.format(train_features.size()))\n",
+    "print('Labels batch shape: {}'.format(train_labels.size()))\n",
+    "\n",
+    "# plot first sample of batch\n",
+    "plt.figure()\n",
+    "plt.title(train_labels[0].numpy())\n",
+    "plt.axis('off')\n",
+    "plt.imshow(train_features[0].squeeze(), cmap='gray')\n",
+    "plt.show()"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.9"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
+%% Cell type:code id:2136dc53-3080-47a5-9995-ea2a1d57f8ec tags:
+
+``` python
+import torch
+import numpy as np
+```
+
+%% Cell type:markdown id:bd151e70-c3c0-44c2-80ad-9cf8acc85e0c tags:
+
+This tutorial heavily draws on [https://pytorch.org/tutorials](https://pytorch.org/tutorials); refer to those pages for further detail.
+
+%% Cell type:markdown id:b1d5ee6c-e12d-43e4-b489-fb00ae26f4c6 tags:
+
+# Data types: everything is a tensor
+
+Tensors in `torch` are the equivalent of `numpy` arrays. See [https://pytorch.org/docs/stable/torch.html](https://pytorch.org/docs/stable/torch.html) for detail.
+
+They can be created from Python or NumPy objects. The data type is inferred automatically unless stated otherwise.
+
+%% Cell type:code id:5deb6c42-fb46-4006-af03-f9a8ce69c2a2 tags:
+
+``` python
+# list
+data = [[0, 2],[5, 4]]
+
+# tensor from list
+t_data = torch.tensor(data)
+
+print('data:\n{}\nt_data:\n{}'.format(data, t_data))
+```
+
+%% Cell type:code id:2c2447ab-0c5a-42be-8429-a77841eed101 tags:
+
+``` python
+print('t_data shape: {}'.format(t_data.shape))
+```
+
+%% Cell type:code id:1c26ebfa-fd79-4b13-90f9-00de802b5b82 tags:
+
+``` python
+print('data type is {}, made of elements of type {}'.format(type(data), type(data[0][0])))
+print('t_data type is {}, made of elements of type {}'.format(type(t_data), t_data.dtype))
+```
+
+%% Cell type:markdown id:cae68bf6-05d9-4f03-9762-54c047383c3d tags:
+
+Types can be overridden, for example to save memory space.
+
+%% Cell type:code id:2a56b0b6-3f49-4562-99e7-f9417cf37941 tags:
+
+``` python
+# tensor from list
+t_data = torch.tensor(data, dtype=torch.float32)
+
+print('t_data type is {}, made of elements of type {}'.format(type(t_data), t_data.dtype))
+```
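To make the memory saving concrete, the sketch below compares the storage footprint of the same values under different dtypes. It relies on `Tensor.element_size()` and `Tensor.nelement()`; the helper `n_bytes` and the variable names are illustrative and not part of the original notebook.

```python
import torch

data = [[0, 2], [5, 4]]

# same values stored with different element types
t_int64 = torch.tensor(data)                          # inferred from Python ints: torch.int64
t_float32 = torch.tensor(data, dtype=torch.float32)   # 4 bytes per element
t_int8 = torch.tensor(data, dtype=torch.int8)         # 1 byte per element


def n_bytes(t):
    # bytes per element times number of elements
    return t.element_size() * t.nelement()


for name, t in [('int64', t_int64), ('float32', t_float32), ('int8', t_int8)]:
    print('{}: dtype={}, {} bytes'.format(name, t.dtype, n_bytes(t)))
```

Smaller types trade range or precision for space, so the choice depends on the values to be stored.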
+
+%% Cell type:markdown id:690ab4f8-e84a-4b9e-9e5b-fae871c69895 tags:
+
+Similarly, tensors can be built from `numpy` arrays.
+
+%% Cell type:code id:b2413d4a-170e-43e9-a98d-41c706072834 tags:
+
+``` python
+x_data = np.array(data, dtype=np.float32)
+
+t_data = torch.from_numpy(x_data)
+
+print('x_data type is {}, made of elements of type {}'.format(type(x_data), x_data.dtype))
+print('t_data type is {}, made of elements of type {}'.format(type(t_data), t_data.dtype))
+```
+
+%% Cell type:markdown id:6784233c-262c-463f-bbde-f0ae3b136d5e tags:
+
+Careful here: the `torch` tensor and `numpy` array are linked together (they share the same memory), so changing one changes the other.
+
+%% Cell type:code id:ffba4a35-3340-4749-98e4-2613326f444b tags:
+
+``` python
+# add one to all elements
+t_data += 1
+print('t_data:\n{}\nx_data:\n{}'.format(t_data, x_data))
+```
+
+%% Cell type:markdown id:e45e939c-932f-4bba-b266-151b0f1d3e15 tags:
+
+But that is not the case with the list...
+
+%% Cell type:code id:48708357-fee8-4634-b315-e70a1c03b532 tags:
+
+``` python
+print(data)
+```
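When the link with the `numpy` array is not wanted, one option is to copy the data instead of sharing it. The sketch below contrasts `torch.from_numpy` (shared memory) with `torch.tensor` (copy); the variable names are illustrative.

```python
import numpy as np
import torch

x = np.array([[0, 2], [5, 4]], dtype=np.float32)

t_shared = torch.from_numpy(x)   # shares memory with x
t_copy = torch.tensor(x)         # copies the data into new memory

t_shared += 1                    # in-place change, also visible in x

print('x:\n{}'.format(x))
print('t_shared:\n{}'.format(t_shared))
print('t_copy (unchanged):\n{}'.format(t_copy))
```

Calling `.clone()` on an existing tensor is another way to obtain an independent copy.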
+
+%% Cell type:markdown id:3a419b5f-202a-49f2-ad93-43bca2b256f3 tags:
+
+A particularity of `torch` is that each tensor keeps track of the device on which it is stored (usually CPU or GPU).
+
+%% Cell type:code id:a10375d2-5e99-4403-9b13-f8a18e09b1f3 tags:
+
+``` python
+print('t_data is stored on: {}'.format(t_data.device))
+```
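A common pattern, sketched here under the assumption that a CUDA GPU may or may not be present, is to pick the device once and move tensors to it with `.to()`. On a machine without a GPU everything simply stays on the CPU.

```python
import torch

# pick the GPU if one is visible to torch, else fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

t = torch.ones((2, 3))     # created on the CPU by default
t_dev = t.to(device)       # tensor on `device` (t itself if it is already there)

print('t is stored on: {}'.format(t.device))
print('t_dev is stored on: {}'.format(t_dev.device))

# tensors have to live on the same device to be combined
t_sum = t_dev + torch.zeros((2, 3), device=device)
print('t_sum is stored on: {}'.format(t_sum.device))
```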
+
+%% Cell type:markdown id:b20bdca8-f70b-4d32-a621-ec0308e45f95 tags:
+
+Most usual constructors from `numpy` are available. See also `torch.zeros_like`, `torch.arange`, `torch.linspace`, `torch.eye`, etc.
+
+%% Cell type:code id:39fafc79-b8dc-4e54-8b7e-0822f28d362b tags:
+
+``` python
+shape = (2,3,)
+
+# tensor filled with zeros
+t_zeros = torch.zeros(shape)
+print('t_zeros: \n {} \n'.format(t_zeros))
+
+# tensor filled with ones
+t_ones = torch.ones(shape)
+print('t_ones: \n {} \n'.format(t_ones))
+
+# tensor filled with random values drawn uniformly from [0, 1)
+t_rand = torch.rand(shape)
+print('t_rand: \n {} \n'.format(t_rand))
+```
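The other constructors mentioned above behave much like their `numpy` counterparts. A short sketch, with illustrative variable names:

```python
import torch

t_ref = torch.rand((2, 3))

t_like = torch.zeros_like(t_ref)     # zeros with the shape and dtype of t_ref
t_range = torch.arange(0, 10, 2)     # 0, 2, 4, 6, 8 (end value excluded)
t_lin = torch.linspace(0., 1., 5)    # 5 evenly spaced points, both ends included
t_eye = torch.eye(3)                 # 3 x 3 identity matrix

print('t_like:\n{}'.format(t_like))
print('t_range: {}'.format(t_range))
print('t_lin: {}'.format(t_lin))
print('t_eye:\n{}'.format(t_eye))
```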
+
+%% Cell type:markdown id:09b88eee-3f0a-4308-993c-17376b039f56 tags:
+
+Likewise, many functions from `numpy` are available as member functions, in particular linear algebra from `numpy.linalg`.
+
+%% Cell type:code id:3fe62515-3a51-431c-9ca4-8036cca8511f tags:
+
+``` python
+# sum
+print('sum of t_rand: \n{}'.format(t_rand.sum()))
+print('or equivalently')
+print('{} \n'.format(torch.sum(t_rand)))
+
+# mean
+print('mean of t_rand: \n{}'.format(t_rand.mean(axis=1)))
+print('or equivalently')
+print('{} \n'.format(torch.mean(t_rand, axis=1)))
+
+# std
+print('standard deviation of t_rand: \n{}'.format(t_rand.std(axis=1)))
+print('or equivalently')
+print('{} \n'.format(torch.std(t_rand, axis=1)))
+
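+# svd (torch.svd is deprecated in recent PyTorch releases; torch.linalg.svd is the recommended replacement)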
+print('singular value decomposition of t_rand: \n{}'.format(t_rand.svd()))
+print('or equivalently')
+print('{} \n'.format(torch.svd(t_rand)))
+```
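As a quick sanity check of the decomposition, the sketch below uses `torch.linalg.svd` (which returns the transposed factor `Vh` rather than `V`) and rebuilds the matrix from its factors. The tolerance passed to `torch.allclose` is an arbitrary choice for float32 data.

```python
import torch

t = torch.rand((2, 3))

# reduced SVD: t = U @ diag(S) @ Vh
U, S, Vh = torch.linalg.svd(t, full_matrices=False)
print('shapes of U, S, Vh: {}, {}, {}'.format(U.shape, S.shape, Vh.shape))

# rebuild the matrix from its factors and compare with the original
t_rec = U @ torch.diag(S) @ Vh
print('reconstruction matches: {}'.format(torch.allclose(t, t_rec, atol=1e-5)))
```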
+
+%% Cell type:markdown id:d0621d20-40ba-4e2f-97f6-7a5657714101 tags:
+
+One has to be careful with the difference between element-wise multiplication of arrays (`torch.mul` or `*`) and matrix multiplication (`torch.matmul` or `@`).
+
+%% Cell type:code id:7dbe3f1f-d984-49cc-bef5-15a9461591b7 tags:
+
+``` python
+t1 = torch.rand((2,2))
+t2 = torch.rand((2,3))
+t3 = torch.rand((2,3))
+
+print('tensors t1, t2 and t3:\n{}\n{}\n{}'.format(t1, t2, t3))
+```
+
+%% Cell type:code id:84684473-62d1-4d4e-b017-1ca5b71ea4ee tags:
+
+``` python
+print('matrix multiplication\n{}'.format(t1.matmul(t2)))
+```
+
+%% Cell type:code id:c685cec2-d0c9-4442-9043-d66090732b80 tags:
+
+``` python
+print('element-wise multiplication\n{}'.format(t2.mul(t3)))
+```
+
+%% Cell type:markdown id:e6672cf6-0ad7-409b-8c57-0bb14fe0e48e tags:
+
+But conditions on shapes apply: both cells below raise a `RuntimeError` because the operand shapes are incompatible.
+
+%% Cell type:code id:73c6f2aa-d8d7-4fe3-a7df-318e827b4bb2 tags:
+
+``` python
+t2.matmul(t3)
+```
+
+%% Cell type:code id:93acbfd6-037c-4c94-98b0-fc017d758249 tags:
+
+``` python
+t1.mul(t2)
+```
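The two failing calls can be made explicit, together with a shape-compatible variant of each. This is only a sketch: the transposition and the column slice `col` are illustrative choices, not the only ways to make the shapes agree.

```python
import torch

t1 = torch.rand((2, 2))
t2 = torch.rand((2, 3))
t3 = torch.rand((2, 3))

# matmul needs the inner dimensions to agree: (2,3) @ (2,3) fails...
try:
    t2.matmul(t3)
except RuntimeError as err:
    print('matmul failed: {}'.format(err))

# ...but (2,3) @ (3,2) works after transposing the second operand
print('t2 @ t3.T has shape {}'.format(t2.matmul(t3.T).shape))

# mul needs equal or broadcastable shapes: (2,2) * (2,3) fails...
try:
    t1.mul(t2)
except RuntimeError as err:
    print('mul failed: {}'.format(err))

# ...but a (2,1) column broadcasts against (2,3)
col = t1[:, :1]
print('col * t2 has shape {}'.format(col.mul(t2).shape))
```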
+
+%% Cell type:markdown id:4facf6f6-7f63-4b48-b112-6e5021d3705d tags:
+
+As in `numpy`, `torch.einsum` provides a flexible way to multiply and sum along specific axes.
+
+%% Cell type:code id:2a40d9d3-1c5a-4782-bf8d-934d75a2c8f8 tags:
+
+``` python
+print('multiplication along axis 0 for t1 and 0 for t2:\n{}'.format(torch.einsum('ij, ik -> jk', t1, t2)))
+```
+
+%% Cell type:code id:f0addf96-9ebb-4728-9d5f-21cb36c212a8 tags:
+
+``` python
+print('multiplication along axis 1 for t1 and 0 for t2:\n{}'.format(torch.einsum('ji, ik -> jk', t1, t2)))
+```
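To relate the index notation to the operations seen earlier, the following sketch checks each `einsum` expression against its explicit counterpart. The third expression is an extra illustration, not taken from the cells above.

```python
import torch

t1 = torch.rand((2, 2))
t2 = torch.rand((2, 3))

# 'ij, ik -> jk' sums over i: same as transposing t1 before the matrix product
e0 = torch.einsum('ij, ik -> jk', t1, t2)
print(torch.allclose(e0, t1.T @ t2))

# 'ji, ik -> jk' sums over i: the ordinary matrix product
e1 = torch.einsum('ji, ik -> jk', t1, t2)
print(torch.allclose(e1, t1 @ t2))

# no summed index: the element-wise product
e2 = torch.einsum('ij, ij -> ij', t2, t2)
print(torch.allclose(e2, t2 * t2))
```

An index that appears in the inputs but not after `->` is summed over; indices kept on the right-hand side label the axes of the result.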
+
+%% Cell type:markdown id:19d265a2-0a7e-4bb2-b462-af37f42e3ce3 tags:
+
+# Datasets and loaders
+
+Data manipulation is eased in PyTorch by utilities that can load large datasets and draw randomized batches of samples. Many datasets are available, for example image datasets in `torchvision`.
+
+%% Cell type:code id:fee7df60-f5ca-4aa6-bcdc-7cf4394c89b8 tags:
+
+``` python
+from torchvision import datasets
+from torchvision.transforms import ToTensor
+from torch.utils.data import DataLoader
+import matplotlib.pyplot as plt
+```
+
+%% Cell type:code id:e7e1a733-ca91-4eb3-8a42-4a422eaed108 tags:
+
+``` python
+training_data = datasets.MNIST(
+    root='tmp',
+    train=True,
+    download=True,
+    transform=ToTensor()
+)
+
+test_data = datasets.MNIST(
+    root='tmp',
+    train=False,
+    download=True,
+    transform=ToTensor()
+)
+```
+
+%% Cell type:code id:6f6db233-d3db-464a-be45-963b06528564 tags:
+
+``` python
+# plot random example samples
+figure = plt.figure(figsize=(8, 8))
+cols, rows = 3, 3
+for i in range(1, cols * rows + 1):
+    sample_idx = torch.randint(len(training_data), size=(1,)).item()
+    img, label = training_data[sample_idx]
+    figure.add_subplot(rows, cols, i)
+    plt.title(label)
+    plt.axis('off')
+    plt.imshow(img.squeeze(), cmap='gray')
+plt.show()
+```
+
+%% Cell type:code id:91bfa9a2-e189-4d01-b77f-456b0d1d4e9a tags:
+
+``` python
+# make a batch loader
+train_dataloader = DataLoader(training_data, batch_size=64, shuffle=True)
+```
+
+%% Cell type:code id:e38b72ee-425a-4e80-b681-74d62ff5a48c tags:
+
+``` python
+# load new batch
+train_features, train_labels = next(iter(train_dataloader))
+print('Feature batch shape: {}'.format(train_features.size()))
+print('Labels batch shape: {}'.format(train_labels.size()))
+
+# plot first sample of batch
+plt.figure()
+plt.title(train_labels[0].numpy())
+plt.axis('off')
+plt.imshow(train_features[0].squeeze(), cmap='gray')
+plt.show()
+```
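In practice the loader is usually iterated over in a `for` loop, one full pass being an epoch. The sketch below avoids the MNIST download by using a synthetic `TensorDataset` as a stand-in (an assumption made only to keep the example self-contained); the sizes are arbitrary.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# synthetic stand-in for MNIST (assumption): 200 images of 1 x 28 x 28, labels in 0..9
features = torch.rand((200, 1, 28, 28))
labels = torch.randint(0, 10, (200,))
dataset = TensorDataset(features, labels)

loader = DataLoader(dataset, batch_size=64, shuffle=True)

# one pass over the loader visits every sample once (one epoch)
n_seen = 0
batch_sizes = []
for batch_features, batch_labels in loader:
    n_seen += len(batch_labels)
    batch_sizes.append(batch_features.shape[0])

print('number of batches: {}'.format(len(loader)))
print('batch sizes: {}'.format(batch_sizes))
print('samples seen: {}'.format(n_seen))
```

The last batch is smaller when the dataset size is not a multiple of `batch_size`; passing `drop_last=True` to `DataLoader` discards it instead.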