"""1`Learn the Basics <intro.html>`_ ||2`Quickstart <quickstart_tutorial.html>`_ ||3`Tensors <tensorqs_tutorial.html>`_ ||4`Datasets & DataLoaders <data_tutorial.html>`_ ||5`Transforms <transforms_tutorial.html>`_ ||6**Build Model** ||7`Autograd <autogradqs_tutorial.html>`_ ||8`Optimization <optimization_tutorial.html>`_ ||9`Save & Load Model <saveloadrun_tutorial.html>`_1011Build the Neural Network12========================1314Neural networks comprise of layers/modules that perform operations on data.15The `torch.nn <https://pytorch.org/docs/stable/nn.html>`_ namespace provides all the building blocks you need to16build your own neural network. Every module in PyTorch subclasses the `nn.Module <https://pytorch.org/docs/stable/generated/torch.nn.Module.html>`_.17A neural network is a module itself that consists of other modules (layers). This nested structure allows for18building and managing complex architectures easily.1920In the following sections, we'll build a neural network to classify images in the FashionMNIST dataset.2122"""2324import os25import torch26from torch import nn27from torch.utils.data import DataLoader28from torchvision import datasets, transforms293031#############################################32# Get Device for Training33# -----------------------34# We want to be able to train our model on a hardware accelerator like the GPU or MPS,35# if available. Let's check to see if `torch.cuda <https://pytorch.org/docs/stable/notes/cuda.html>`_36# or `torch.backends.mps <https://pytorch.org/docs/stable/notes/mps.html>`_ are available, otherwise we use the CPU.3738device = (39"cuda"40if torch.cuda.is_available()41else "mps"42if torch.backends.mps.is_available()43else "cpu"44)45print(f"Using {device} device")4647##############################################48# Define the Class49# -------------------------50# We define our neural network by subclassing ``nn.Module``, and51# initialize the neural network layers in ``__init__``. Every ``nn.Module`` subclass implements52# the operations on input data in the ``forward`` method.5354class NeuralNetwork(nn.Module):55def __init__(self):56super().__init__()57self.flatten = nn.Flatten()58self.linear_relu_stack = nn.Sequential(59nn.Linear(28*28, 512),60nn.ReLU(),61nn.Linear(512, 512),62nn.ReLU(),63nn.Linear(512, 10),64)6566def forward(self, x):67x = self.flatten(x)68logits = self.linear_relu_stack(x)69return logits7071##############################################72# We create an instance of ``NeuralNetwork``, and move it to the ``device``, and print73# its structure.7475model = NeuralNetwork().to(device)76print(model)777879##############################################80# To use the model, we pass it the input data. 
######################################################################
# --------------
#


##############################################
# Model Layers
# -------------------------
#
# Let's break down the layers in the FashionMNIST model. To illustrate it, we
# will take a sample minibatch of 3 images of size 28x28 and see what happens to it as
# we pass it through the network.

input_image = torch.rand(3,28,28)
print(input_image.size())

##################################################
# nn.Flatten
# ^^^^^^^^^^^^^^^^^^^^^^
# We initialize the `nn.Flatten <https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html>`_
# layer to convert each 2D 28x28 image into a contiguous array of 784 pixel values
# (the minibatch dimension, at dim=0, is maintained).

flatten = nn.Flatten()
flat_image = flatten(input_image)
print(flat_image.size())

##############################################
# nn.Linear
# ^^^^^^^^^^^^^^^^^^^^^^
# The `linear layer <https://pytorch.org/docs/stable/generated/torch.nn.Linear.html>`_
# is a module that applies a linear transformation on the input using its stored weights and biases.
#
layer1 = nn.Linear(in_features=28*28, out_features=20)
hidden1 = layer1(flat_image)
print(hidden1.size())


#################################################
# nn.ReLU
# ^^^^^^^^^^^^^^^^^^^^^^
# Non-linear activations are what create the complex mappings between the model's inputs and outputs.
# They are applied after linear transformations to introduce *nonlinearity*, helping neural networks
# learn a wide variety of phenomena.
#
# In this model, we use `nn.ReLU <https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html>`_ between our
# linear layers, but there are other activations to introduce non-linearity in your model.

print(f"Before ReLU: {hidden1}\n\n")
hidden1 = nn.ReLU()(hidden1)
print(f"After ReLU: {hidden1}")



#################################################
# nn.Sequential
# ^^^^^^^^^^^^^^^^^^^^^^
# `nn.Sequential <https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html>`_ is an ordered
# container of modules. The data is passed through all the modules in the same order as defined. You can use
# sequential containers to put together a quick network like ``seq_modules``.

seq_modules = nn.Sequential(
    flatten,
    layer1,
    nn.ReLU(),
    nn.Linear(20, 10)
)
input_image = torch.rand(3,28,28)
logits = seq_modules(input_image)

################################################################
# nn.Softmax
# ^^^^^^^^^^^^^^^^^^^^^^
# The last linear layer of the neural network returns `logits` - raw values in [-\infty, \infty] - which are passed to the
# `nn.Softmax <https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html>`_ module. The logits are scaled to values in
# [0, 1] representing the model's predicted probabilities for each class. The ``dim`` parameter indicates the dimension along
# which the values must sum to 1.

softmax = nn.Softmax(dim=1)
pred_probab = softmax(logits)
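################################################################
# To see that ``nn.Softmax`` behaves as described, we can check that each row of ``pred_probab``
# sums to 1 and read off the predicted class for each image. A minimal sketch, reusing
# ``pred_probab`` from above:

print(pred_probab.sum(dim=1))   # each of the 3 rows should sum to ~1.0
print(pred_probab.argmax(1))    # predicted class index for each image in the minibatch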

#################################################
# Model Parameters
# -------------------------
# Many layers inside a neural network are *parameterized*, i.e. have associated weights
# and biases that are optimized during training. Subclassing ``nn.Module`` automatically
# tracks all fields defined inside your model object, and makes all parameters
# accessible using your model's ``parameters()`` or ``named_parameters()`` methods.
#
# In this example, we iterate over each parameter and print its size and a preview of its values.


print(f"Model structure: {model}\n\n")

for name, param in model.named_parameters():
    print(f"Layer: {name} | Size: {param.size()} | Values : {param[:2]} \n")

######################################################################
# --------------
#

#################################################################
# Further Reading
# -----------------
# - `torch.nn API <https://pytorch.org/docs/stable/nn.html>`_
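
#################################################################
# Finally, to make the *parameterized* idea above more concrete, here is a small optional sketch
# that counts the trainable parameters of ``model`` with the same ``parameters()`` method
# (``numel`` and ``requires_grad`` are standard tensor attributes).

total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Total trainable parameters: {total_params}")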