CoCalc provides the best real-time collaborative environment for Jupyter Notebooks, LaTeX documents, and SageMath, scalable from individual users to large groups and classes!
Path: blob/main/recipes_source/recipes/save_load_across_devices.py
"""1Saving and loading models across devices in PyTorch2===================================================34There may be instances where you want to save and load your neural5networks across different devices.67Introduction8------------910Saving and loading models across devices is relatively straightforward11using PyTorch. In this recipe, we will experiment with saving and12loading models across CPUs and GPUs.1314Setup15-----1617In order for every code block to run properly in this recipe, you must18first change the runtime to “GPU” or higher. Once you do, we need to19install ``torch`` if it isn’t already available.2021.. code-block:: sh2223pip install torch2425"""2627######################################################################28# Steps29# -----30#31# 1. Import all necessary libraries for loading our data32# 2. Define and initialize the neural network33# 3. Save on a GPU, load on a CPU34# 4. Save on a GPU, load on a GPU35# 5. Save on a CPU, load on a GPU36# 6. Saving and loading ``DataParallel`` models37#38# 1. Import necessary libraries for loading our data39# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~40#41# For this recipe, we will use ``torch`` and its subsidiaries ``torch.nn``42# and ``torch.optim``.43#4445import torch46import torch.nn as nn47import torch.optim as optim484950######################################################################51# 2. Define and initialize the neural network52# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~53#54# For sake of example, we will create a neural network for training55# images. 
# To learn more, see the Defining a Neural Network recipe.
#

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
print(net)


######################################################################
# 3. Save on GPU, Load on CPU
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# When loading a model on a CPU that was trained with a GPU, pass
# ``torch.device('cpu')`` to the ``map_location`` argument of the
# ``torch.load()`` function.
#

# Specify a path to save to
PATH = "model.pt"

# Save
torch.save(net.state_dict(), PATH)

# Load
device = torch.device('cpu')
model = Net()
model.load_state_dict(torch.load(PATH, map_location=device, weights_only=True))


######################################################################
# In this case, the storages underlying the tensors are dynamically
# remapped to the CPU device using the ``map_location`` argument.
#
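As a quick sanity check of the pattern above, here is a minimal, self-contained round trip (using a stand-in ``nn.Linear`` in place of ``Net``) that saves a ``state_dict`` and reloads it onto the CPU via ``map_location``:

```python
import torch
import torch.nn as nn

# Stand-in model; any nn.Module works the same way.
tiny = nn.Linear(4, 2)
torch.save(tiny.state_dict(), "tiny.pt")

# Load the weights, remapping any storages to the CPU.
restored = nn.Linear(4, 2)
restored.load_state_dict(
    torch.load("tiny.pt", map_location=torch.device("cpu"), weights_only=True)
)

# Every restored parameter now lives on the CPU.
print(all(p.device.type == "cpu" for p in restored.parameters()))  # True
```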
######################################################################
# 4. Save on GPU, Load on GPU
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# When loading a model on a GPU that was trained and saved on a GPU, simply
# convert the initialized model to a CUDA optimized model using
# ``model.to(torch.device('cuda'))``.
#
# Be sure to call ``.to(torch.device('cuda'))`` on all model
# inputs as well, to prepare the data for the model.
#

# Save
torch.save(net.state_dict(), PATH)

# Load
device = torch.device("cuda")
model = Net()
model.load_state_dict(torch.load(PATH, weights_only=True))
model.to(device)


######################################################################
# Note that calling ``my_tensor.to(device)`` returns a new copy of
# ``my_tensor`` on the GPU. It does NOT overwrite ``my_tensor``. Therefore,
# remember to manually overwrite tensors:
# ``my_tensor = my_tensor.to(torch.device('cuda'))``.
#
# 5. Save on CPU, Load on GPU
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# When loading a model on a GPU that was trained and saved on a CPU, set the
# ``map_location`` argument of the ``torch.load()`` function to
# ``cuda:device_id``. This loads the model to the given GPU device.
#
# Be sure to call ``model.to(torch.device('cuda'))`` to convert the
# model's parameter tensors to CUDA tensors.
#
# Finally, be sure to call ``.to(torch.device('cuda'))`` on all model
# inputs to prepare the data for the CUDA optimized model.
#

# Save
torch.save(net.state_dict(), PATH)

# Load
device = torch.device("cuda")
model = Net()
# Choose whatever GPU device number you want
model.load_state_dict(torch.load(PATH, map_location="cuda:0", weights_only=True))
# Make sure to call input = input.to(device) on any input tensors that you feed to the model
model.to(device)
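Because ``.to(device)`` returns a copy, the input-preparation step called out above can be sketched device-agnostically (this illustration uses a stand-in ``nn.Linear`` and falls back to the CPU when CUDA is unavailable):

```python
import torch
import torch.nn as nn

# Use the GPU when present; this sketch degrades gracefully to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(8, 3).to(device)

# .to(device) returns a new tensor -- reassign, don't rely on mutation.
batch = torch.randn(2, 8)
batch = batch.to(device)

output = model(batch)
print(output.shape)  # torch.Size([2, 3])
```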
######################################################################
# 6. Saving ``torch.nn.DataParallel`` Models
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# ``torch.nn.DataParallel`` is a model wrapper that enables parallel GPU
# utilization.
#
# To save a ``DataParallel`` model generically, save
# ``model.module.state_dict()``. This way, you have the flexibility to
# load the model any way you want to any device you want.
#

# Wrap the model in DataParallel, then save the underlying module's weights
net = nn.DataParallel(net)
torch.save(net.module.state_dict(), PATH)

# Load to whatever device you want


######################################################################
# Congratulations! You have successfully saved and loaded models across
# devices in PyTorch.
#
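The "load to whatever device you want" step above could look like the following sketch (again with a stand-in ``nn.Linear``): persist the unwrapped ``module``'s weights, load them into a plain model on whichever device is available, and optionally re-wrap in ``DataParallel`` for multi-GPU execution:

```python
import torch
import torch.nn as nn

# Save: always persist the underlying module's state_dict,
# not the DataParallel wrapper's.
wrapped = nn.DataParallel(nn.Linear(4, 2))
torch.save(wrapped.module.state_dict(), "dp_model.pt")

# Load: restore into a plain module on whatever device is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
restored = nn.Linear(4, 2)
restored.load_state_dict(
    torch.load("dp_model.pt", map_location=device, weights_only=True)
)
restored.to(device)

# Optionally re-wrap for multi-GPU use:
# restored = nn.DataParallel(restored)
```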