"""
`Learn the Basics <intro.html>`_ ||
`Quickstart <quickstart_tutorial.html>`_ ||
`Tensors <tensorqs_tutorial.html>`_ ||
`Datasets & DataLoaders <data_tutorial.html>`_ ||
`Transforms <transforms_tutorial.html>`_ ||
**Build Model** ||
`Autograd <autogradqs_tutorial.html>`_ ||
`Optimization <optimization_tutorial.html>`_ ||
`Save & Load Model <saveloadrun_tutorial.html>`_

Build the Neural Network
========================

Neural networks comprise layers/modules that perform operations on data.
The `torch.nn <https://pytorch.org/docs/stable/nn.html>`_ namespace provides all the building blocks you need to
build your own neural network. Every module in PyTorch subclasses `nn.Module <https://pytorch.org/docs/stable/generated/torch.nn.Module.html>`_.
A neural network is itself a module that consists of other modules (layers). This nested structure allows for
building and managing complex architectures easily.

In the following sections, we'll build a neural network to classify images in the FashionMNIST dataset.

"""

import os
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms


#############################################
# Get Device for Training
# -----------------------
# We want to be able to train our model on an `accelerator <https://pytorch.org/docs/stable/torch.html#accelerators>`__
# such as CUDA, MPS, MTIA, or XPU. If the current accelerator is available, we will use it. Otherwise, we use the CPU.

device = torch.accelerator.current_accelerator().type if torch.accelerator.is_available() else "cpu"
print(f"Using {device} device")

##############################################
# Define the Class
# -------------------------
# We define our neural network by subclassing ``nn.Module``, and
# initialize the neural network layers in ``__init__``. Every ``nn.Module`` subclass implements
# the operations on input data in the ``forward`` method.

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

##############################################
# We create an instance of ``NeuralNetwork``, move it to the ``device``, and print
# its structure.

model = NeuralNetwork().to(device)
print(model)


##############################################
# To use the model, we pass it the input data. This executes the model's ``forward``,
# along with some `background operations <https://github.com/pytorch/pytorch/blob/270111b7b611d174967ed204776985cefca9c144/torch/nn/modules/module.py#L866>`_.
# Do not call ``model.forward()`` directly!
#
# Calling the model on the input returns a 2-dimensional tensor: dim=0 indexes each sample in the
# minibatch (each one an output of 10 raw predicted values, one per class), and dim=1 indexes the
# individual values within each output.
# We get the prediction probabilities by passing the output through an instance of the ``nn.Softmax`` module.

X = torch.rand(1, 28, 28, device=device)
logits = model(X)
pred_probab = nn.Softmax(dim=1)(logits)
y_pred = pred_probab.argmax(1)
print(f"Predicted class: {y_pred}")
######################################################################
# --------------
#


##############################################
# Model Layers
# -------------------------
#
# Let's break down the layers in the FashionMNIST model. To illustrate this, we
# will take a sample minibatch of 3 images of size 28x28 and see what happens to it as
# we pass it through the network.

input_image = torch.rand(3,28,28)
print(input_image.size())

##################################################
# nn.Flatten
# ^^^^^^^^^^^^^^^^^^^^^^
# We initialize the `nn.Flatten <https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html>`_
# layer to convert each 2D 28x28 image into a contiguous array of 784 pixel values
# (the minibatch dimension, at dim=0, is maintained).

flatten = nn.Flatten()
flat_image = flatten(input_image)
print(flat_image.size())

##############################################
# nn.Linear
# ^^^^^^^^^^^^^^^^^^^^^^
# The `linear layer <https://pytorch.org/docs/stable/generated/torch.nn.Linear.html>`_
# is a module that applies a linear transformation on the input using its stored weights and biases.
#
layer1 = nn.Linear(in_features=28*28, out_features=20)
hidden1 = layer1(flat_image)
print(hidden1.size())


#################################################
# nn.ReLU
# ^^^^^^^^^^^^^^^^^^^^^^
# Non-linear activations are what create the complex mappings between the model's inputs and outputs.
# They are applied after linear transformations to introduce *nonlinearity*, helping neural networks
# learn a wide variety of phenomena.
#
# In this model, we use `nn.ReLU <https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html>`_ between our
# linear layers, but there are other activations that can introduce non-linearity in your model.

print(f"Before ReLU: {hidden1}\n\n")
hidden1 = nn.ReLU()(hidden1)
print(f"After ReLU: {hidden1}")
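#################################################
# For example, activations such as `nn.Tanh <https://pytorch.org/docs/stable/generated/torch.nn.Tanh.html>`_ or
# `nn.GELU <https://pytorch.org/docs/stable/generated/torch.nn.GELU.html>`_ can be applied in exactly the same way;
# the snippet below is just one possible illustration, applying ``nn.Tanh`` to the output of ``layer1``.

alt_hidden = nn.Tanh()(layer1(flat_image))
print(f"After Tanh: {alt_hidden}")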
#################################################
# nn.Sequential
# ^^^^^^^^^^^^^^^^^^^^^^
# `nn.Sequential <https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html>`_ is an ordered
# container of modules. The data is passed through all the modules in the same order as defined. You can use
# sequential containers to put together a quick network like ``seq_modules``.

seq_modules = nn.Sequential(
    flatten,
    layer1,
    nn.ReLU(),
    nn.Linear(20, 10)
)
input_image = torch.rand(3,28,28)
logits = seq_modules(input_image)

################################################################
# nn.Softmax
# ^^^^^^^^^^^^^^^^^^^^^^
# The last linear layer of the neural network returns `logits` - raw values in [-\infty, \infty] - which are passed to the
# `nn.Softmax <https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html>`_ module. The logits are scaled to values in
# [0, 1] representing the model's predicted probabilities for each class. The ``dim`` parameter indicates the dimension along
# which the values must sum to 1.

softmax = nn.Softmax(dim=1)
pred_probab = softmax(logits)


#################################################
# Model Parameters
# -------------------------
# Many layers inside a neural network are *parameterized*, i.e. they have associated weights
# and biases that are optimized during training. Subclassing ``nn.Module`` automatically
# tracks all fields defined inside your model object, and makes all parameters
# accessible using your model's ``parameters()`` or ``named_parameters()`` methods.
#
# In this example, we iterate over each parameter, and print its size and a preview of its values.
#


print(f"Model structure: {model}\n\n")

for name, param in model.named_parameters():
    print(f"Layer: {name} | Size: {param.size()} | Values : {param[:2]} \n")

######################################################################
# --------------
#

#################################################################
# Further Reading
# -----------------
# - `torch.nn API <https://pytorch.org/docs/stable/nn.html>`_
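#################################################
# As a quick follow-up to the Model Parameters section above, one common way to summarize a
# model's size is to sum the number of elements in its parameter tensors; for example:

total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Total parameters: {total_params} | Trainable parameters: {trainable_params}")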