# GitHub Repository: pytorch/tutorials
# Path: blob/main/recipes_source/recipes/save_load_across_devices.py
"""
Saving and loading models across devices in PyTorch
===================================================

There may be instances where you want to save and load your neural
networks across different devices.

Introduction
------------

Saving and loading models across devices is relatively straightforward
using PyTorch. In this recipe, we will experiment with saving and
loading models across CPUs and GPUs.

Setup
-----

In order for every code block to run properly in this recipe, you must
first switch the runtime to one with a GPU. After that, install
``torch`` if it isn't already available.

.. code-block:: sh

   pip install torch

"""

######################################################################
# Steps
# -----
#
# 1. Import all necessary libraries for loading our data
# 2. Define and initialize the neural network
# 3. Save on a GPU, load on a CPU
# 4. Save on a GPU, load on a GPU
# 5. Save on a CPU, load on a GPU
# 6. Saving and loading ``DataParallel`` models
#
# 1. Import necessary libraries for loading our data
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# For this recipe, we will use ``torch`` and its submodules ``torch.nn``
# and ``torch.optim``.
#

import torch
import torch.nn as nn
import torch.nn.functional as F  # provides ``F.relu``, used in ``Net.forward``
import torch.optim as optim


######################################################################
# 2. Define and initialize the neural network
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# For the sake of example, we will create a small convolutional neural
# network that takes images as input. To learn more, see the Defining a
# Neural Network recipe.
#

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
print(net)


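######################################################################
# The ``16 * 5 * 5`` input size of ``fc1`` implies a particular image
# size. As a rough check, the sketch below rebuilds only the
# convolution and pooling layers (ReLU is left out because it does not
# change shapes) and pushes one dummy batch through them. The 32x32 RGB
# input is an assumption inferred from the layer sizes, not something
# this recipe states, and ``shape_probe`` is a hypothetical name.
#

import torch
import torch.nn as nn

# Same layer hyperparameters as the convolutional part of ``Net``.
shape_probe = nn.Sequential(
    nn.Conv2d(3, 6, 5),    # 32x32 -> 28x28
    nn.MaxPool2d(2, 2),    # 28x28 -> 14x14
    nn.Conv2d(6, 16, 5),   # 14x14 -> 10x10
    nn.MaxPool2d(2, 2),    # 10x10 -> 5x5
)

dummy_batch = torch.randn(1, 3, 32, 32)  # one random 3-channel 32x32 "image"
feature_shape = tuple(shape_probe(dummy_batch).shape)
print(feature_shape)  # (1, 16, 5, 5), so flattening gives 16 * 5 * 5 = 400
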
######################################################################
# 3. Save on GPU, Load on CPU
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# When loading a model on a CPU that was trained with a GPU, pass
# ``torch.device('cpu')`` to the ``map_location`` argument in the
# ``torch.load()`` function.
#

# Specify a path to save to
PATH = "model.pt"

# Save
torch.save(net.state_dict(), PATH)

# Load
device = torch.device('cpu')
model = Net()
model.load_state_dict(torch.load(PATH, map_location=device, weights_only=True))


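######################################################################
# One way to confirm where the loaded tensors ended up is to inspect
# their ``device`` attribute. The sketch below is a minimal,
# self-contained check rather than part of the recipe proper:
# ``demo_source`` (a small ``nn.Linear`` standing in for ``Net``) and
# the temporary checkpoint path are hypothetical, and a GPU is used for
# saving only when one is available.
#

import os
import tempfile

import torch
import torch.nn as nn

# Save from the GPU when there is one; otherwise this reduces to CPU -> CPU.
save_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
demo_source = nn.Linear(4, 2).to(save_device)

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_model.pt")
    torch.save(demo_source.state_dict(), demo_path)
    # ``map_location`` remaps every saved tensor to the CPU while loading.
    cpu_state = torch.load(
        demo_path, map_location=torch.device("cpu"), weights_only=True
    )

demo_restored = nn.Linear(4, 2)
demo_restored.load_state_dict(cpu_state)

# True when every tensor in the loaded state dict lives on the CPU.
loaded_on_cpu = all(t.device.type == "cpu" for t in cpu_state.values())
print(loaded_on_cpu)
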
######################################################################
# With ``map_location`` set to the CPU device, the storages underlying
# the tensors are remapped to the CPU as they are loaded.
#
# 4. Save on GPU, Load on GPU
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# When loading a model on a GPU that was trained and saved on GPU, simply
# move the initialized model to the GPU using
# ``model.to(torch.device('cuda'))``.
#
# Be sure to use the ``.to(torch.device('cuda'))`` function on all model
# inputs as well, so that the data is on the same device as the model.
#

# Save
torch.save(net.state_dict(), PATH)

# Load
device = torch.device("cuda")
model = Net()
model.load_state_dict(torch.load(PATH, weights_only=True))
model.to(device)


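######################################################################
# The model and its inputs have to live on the same device before a
# forward pass. The sketch below illustrates the pattern under a few
# assumptions: ``demo_model`` is a small ``nn.Linear`` standing in for
# ``Net``, the input batch is random, and the code falls back to the CPU
# when no GPU is available so that it runs anywhere.
#

import torch
import torch.nn as nn

run_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# ``nn.Module.to`` moves the parameters in place and returns the module itself.
demo_model = nn.Linear(4, 2)
same_object = demo_model.to(run_device) is demo_model

# ``Tensor.to`` does not modify the tensor it is called on, so keep the result.
demo_inputs = torch.randn(8, 4)
demo_inputs = demo_inputs.to(run_device)

with torch.no_grad():
    demo_outputs = demo_model(demo_inputs)

print(same_object, demo_outputs.device.type == run_device.type)
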
######################################################################
# Note that calling ``my_tensor.to(device)`` returns a new copy of
# ``my_tensor`` on GPU. It does NOT overwrite ``my_tensor``. Therefore,
# remember to manually overwrite tensors:
# ``my_tensor = my_tensor.to(torch.device('cuda'))``.
#
# 5. Save on CPU, Load on GPU
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# When loading a model on a GPU that was trained and saved on CPU, set the
# ``map_location`` argument in the ``torch.load()`` function to
# ``cuda:device_id``. This loads the saved tensors onto the given GPU
# device.
#
# Be sure to also call ``model.to(torch.device('cuda'))``.
# ``load_state_dict()`` copies the loaded values into the parameters the
# model already has, and a freshly constructed model keeps its parameters
# on the CPU until it is moved.
#
# Finally, be sure to use the ``.to(torch.device('cuda'))`` function on
# all model inputs so that the data is on the same device as the model.
#

# Save
torch.save(net.state_dict(), PATH)

# Load
device = torch.device("cuda")
model = Net()
# Choose whatever GPU device number you want
model.load_state_dict(torch.load(PATH, map_location="cuda:0", weights_only=True))
# Make sure to call input = input.to(device) on any input tensors that you feed to the model
model.to(device)


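######################################################################
# The sketch below shows why the explicit ``model.to(device)`` call is
# still needed after loading with ``map_location``. It is an
# illustration under assumptions, not part of the recipe: a small
# ``nn.Linear`` stands in for ``Net``, an in-memory buffer replaces the
# checkpoint file, and the target falls back to the CPU when no GPU is
# available (in which case all three printed devices are ``cpu``).
#

import io

import torch
import torch.nn as nn

target_device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Serialize a CPU state dict to an in-memory buffer instead of a file.
buffer = io.BytesIO()
torch.save(nn.Linear(4, 2).state_dict(), buffer)
buffer.seek(0)

# The loaded tensors land on ``target_device`` ...
gpu_state = torch.load(buffer, map_location=target_device, weights_only=True)
state_device = gpu_state["weight"].device.type

# ... but ``load_state_dict`` copies their values into the parameters the
# freshly constructed model already has, and those were created on the CPU.
late_moved_model = nn.Linear(4, 2)
late_moved_model.load_state_dict(gpu_state)
device_before_move = late_moved_model.weight.device.type

# Only this call actually moves the model's parameters.
late_moved_model.to(target_device)
device_after_move = late_moved_model.weight.device.type

print(state_device, device_before_move, device_after_move)
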
######################################################################
# 6. Saving ``torch.nn.DataParallel`` Models
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# ``torch.nn.DataParallel`` is a model wrapper that enables parallel GPU
# utilization.
#
# To save a ``DataParallel`` model generically, save the
# ``model.module.state_dict()``. This way, you have the flexibility to
# load the model any way you want to any device you want.
#

# Wrap the model first; ``.module`` only exists on the ``DataParallel`` wrapper
net = nn.DataParallel(net)

# Save
torch.save(net.module.state_dict(), PATH)

# Load to whatever device you want


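######################################################################
# The loading half is left open above. One possible way to do it is
# sketched below, under stated assumptions: a small ``nn.Linear`` stands
# in for ``Net``, an in-memory buffer replaces the checkpoint file, and
# the target device is a GPU only when one is available. The sketch also
# prints the state dict keys to show why saving ``module.state_dict()``
# keeps the checkpoint loadable by an unwrapped model.
#

import io

import torch
import torch.nn as nn

# Constructing the wrapper also works on a CPU-only machine.
wrapped = nn.DataParallel(nn.Linear(4, 2))

# The wrapper's own state dict prefixes every key with "module.",
# while the inner module's keys match those of an unwrapped model.
wrapper_keys = list(wrapped.state_dict().keys())
inner_keys = list(wrapped.module.state_dict().keys())
print(wrapper_keys)  # ['module.weight', 'module.bias']
print(inner_keys)    # ['weight', 'bias']

# Save the inner module's state dict.
buffer = io.BytesIO()
torch.save(wrapped.module.state_dict(), buffer)
buffer.seek(0)

# Load into a plain, unwrapped model on whichever device is available.
load_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
plain_model = nn.Linear(4, 2)
plain_model.load_state_dict(
    torch.load(buffer, map_location=load_device, weights_only=True)
)
plain_model.to(load_device)
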
######################################################################
# Congratulations! You have successfully saved and loaded models across
# devices in PyTorch.
#