"""
Title: Reproducibility in Keras Models
Author: [Frightera](https://github.com/Frightera)
Date created: 2023/05/05
Last modified: 2023/05/05
Description: Demonstration of random weight initialization and reproducibility in Keras models.
Accelerator: GPU
"""

"""
11
## Introduction
12
13
This example demonstrates how to control randomness in Keras models. Sometimes
14
you may want to reproduce the exact same results across runs, for experimentation
15
purposes or to debug a problem.
16
"""
17
18
"""
19
## Setup
20
"""
21
import json
22
import numpy as np
23
import tensorflow as tf
24
import keras
25
from keras import layers
26
from keras import initializers
27
28
# Set the seed using keras.utils.set_random_seed. This will set:
29
# 1) `numpy` seed
30
# 2) backend random seed
31
# 3) `python` random seed
32
keras.utils.set_random_seed(812)
33
34
# If using TensorFlow, this will make GPU ops as deterministic as possible,
35
# but it will affect the overall performance, so be mindful of that.
36
tf.config.experimental.enable_op_determinism()
37
38
39
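"""
For reference, here is a minimal sketch of what `keras.utils.set_random_seed`
does under the hood with the TensorFlow backend. It is roughly equivalent to
seeding the three generators yourself:

```python
import random

np.random.seed(812)  # 1) `numpy` seed
tf.random.set_seed(812)  # 2) backend (TensorFlow) random seed
random.seed(812)  # 3) `python` random seed
```

Using the single Keras utility is preferable, as it keeps all three seeds in
sync with one call.
"""
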
"""
40
## Weight initialization in Keras
41
42
Most of the layers in Keras have `kernel_initializer` and `bias_initializer`
43
parameters. These parameters allow you to specify the strategy used for
44
initializing the weights of layer variables. The following built-in initializers
45
are available as part of `keras.initializers`:
46
"""
47
48
initializers_list = [
    initializers.RandomNormal,
    initializers.RandomUniform,
    initializers.TruncatedNormal,
    initializers.VarianceScaling,
    initializers.GlorotNormal,
    initializers.GlorotUniform,
    initializers.HeNormal,
    initializers.HeUniform,
    initializers.LecunNormal,
    initializers.LecunUniform,
    initializers.Orthogonal,
]

"""
63
In a reproducible model, the weights of the model should be initialized with
64
same values in subsequent runs. First, we'll check how initializers behave when
65
they are called multiple times with same `seed` value.
66
"""
67
68
for initializer in initializers_list:
    print(f"Running {initializer}")

    for iteration in range(2):
        # To get the same results across multiple runs from an initializer,
        # you can specify a seed value.
        result = float(initializer(seed=42)(shape=(1, 1)))
        print(f"\tIteration --> {iteration} // Result --> {result}")
    print("\n")

"""
80
Now, let's inspect how two different initializer objects behave when they are
81
have the same seed value.
82
"""
83
84
# Setting the seed value for an initializer will cause two different objects
# to produce the same results.
glorot_normal_1 = keras.initializers.GlorotNormal(seed=42)
glorot_normal_2 = keras.initializers.GlorotNormal(seed=42)

input_dim, neurons = 3, 5

# Call the two different objects with the same shape.
result_1 = glorot_normal_1(shape=(input_dim, neurons))
result_2 = glorot_normal_2(shape=(input_dim, neurons))

# Check if the results are equal.
equal = np.allclose(result_1, result_2)
print(f"Are the results equal? {equal}")

"""
100
If the seed value is not set (or different seed values are used), two different
101
objects will produce different results. Since the random seed is set at the beginning
102
of the notebook, the results will be same in the sequential runs. This is related
103
to the `keras.utils.set_random_seed`.
104
"""
105
106
glorot_normal_3 = keras.initializers.GlorotNormal()
glorot_normal_4 = keras.initializers.GlorotNormal()

# Let's call the first initializer.
result_3 = glorot_normal_3(shape=(input_dim, neurons))

# Call the second initializer.
result_4 = glorot_normal_4(shape=(input_dim, neurons))

equal = np.allclose(result_3, result_4)
print(f"Are the results equal? {equal}")

"""
119
`result_3` and `result_4` will be different, but when you run the notebook
120
again, `result_3` will have identical values to the ones in the previous run.
121
Same goes for `result_4`.
122
"""
123
124
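"""
A quick way to verify this across runs is to persist the values, mirroring what
we do with the training history later in this example. This is a hypothetical
sketch; the `result_3.npy` file is not part of the recipe:

```python
import os

if os.path.exists("result_3.npy"):
    # Second run: compare against the values saved by the first run.
    previous = np.load("result_3.npy")
    print("Same as previous run?", np.allclose(result_3, previous))
else:
    # First run: save the values for the comparison above.
    np.save("result_3.npy", np.array(result_3))
```
"""
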
"""
125
## Reproducibility in model training process
126
If you want to reproduce the results of a model training process, you need to
127
control the randomness sources during the training process. In order to show a
128
realistic example, this section utilizes `tf.data` using parallel map and shuffle
129
operations.
130
131
In order to start, let's create a simple function which returns the history
132
object of the Keras model.
133
"""
134
135
136
def train_model(train_data: tf.data.Dataset, test_data: tf.data.Dataset) -> dict:
    model = keras.Sequential(
        [
            layers.Conv2D(32, (3, 3), activation="relu"),
            layers.MaxPooling2D((2, 2)),
            layers.Dropout(0.2),
            layers.Conv2D(32, (3, 3), activation="relu"),
            layers.MaxPooling2D((2, 2)),
            layers.Dropout(0.2),
            layers.Conv2D(32, (3, 3), activation="relu"),
            layers.GlobalAveragePooling2D(),
            layers.Dense(64, activation="relu"),
            layers.Dropout(0.2),
            layers.Dense(10, activation="softmax"),
        ]
    )

    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
        jit_compile=False,
    )
    # jit_compile's default value is "auto", which can cause problems with
    # some ops, so it is explicitly set to False above.

    # model.fit has a `shuffle` parameter which has a default value of `True`.
    # If you are using array-like objects, this will shuffle the data before
    # training. This argument is ignored when `x` is a generator or a
    # `tf.data.Dataset`.
    history = model.fit(train_data, epochs=2, validation_data=test_data)

    print(f"Model accuracy on test data: {model.evaluate(test_data)[1] * 100:.2f}%")

    return history.history


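"""
Before training, it can also be reassuring to check that the weight
initialization itself is reproducible. The following is a minimal sketch, not
part of the original recipe; `build_tiny_model` is a hypothetical helper:

```python
def build_tiny_model():
    # A tiny stand-in model; `build` materializes the weights.
    model = keras.Sequential([layers.Dense(4)])
    model.build(input_shape=(None, 8))
    return model

keras.utils.set_random_seed(812)
weights_a = build_tiny_model().get_weights()

# Re-seeding resets the generators, so rebuilding yields identical weights.
keras.utils.set_random_seed(812)
weights_b = build_tiny_model().get_weights()

print(all(np.allclose(a, b) for a, b in zip(weights_a, weights_b)))  # True
```
"""
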
# Load the MNIST dataset
(train_images, train_labels), (
    test_images,
    test_labels,
) = keras.datasets.mnist.load_data()

# Construct tf.data.Dataset objects
train_ds = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
test_ds = tf.data.Dataset.from_tensor_slices((test_images, test_labels))

"""
184
Remember we called `tf.config.experimental.enable_op_determinism()` at the
185
beginning of the function. This makes the `tf.data` operations deterministic.
186
However, making `tf.data` operations deterministic comes with a performance
187
cost. If you want to learn more about it, please check this
188
[official guide](https://www.tensorflow.org/api_docs/python/tf/config/experimental/enable_op_determinism#determinism_and_tfdata).
189
190
Small summary what's going on here. Models have `kernel_initializer` and
191
`bias_initializer` parameters. Since we set random seeds using
192
`keras.utils.set_random_seed` in the beginning of the notebook, the initializers
193
will produce same results in the sequential runs. Additionally, TensorFlow
194
operations have now become deterministic. Frequently, you will be utilizing GPUs
195
that have thousands of hardware threads which causes non-deterministic behavior
196
to occur.
197
"""
198
199
200
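"""
Alternatively, individual layers can be made reproducible on their own by
passing explicitly seeded initializers. This is a minimal sketch and is not
wired into the training pipeline below:

```python
# A Dense layer whose initial weights are identical on every run, even
# without a global seed, because its initializers are seeded explicitly.
reproducible_dense = layers.Dense(
    64,
    activation="relu",
    kernel_initializer=initializers.GlorotUniform(seed=42),
    bias_initializer=initializers.Zeros(),
)
```
"""
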
def prepare_dataset(image, label):
    # Cast and normalize the image
    image = tf.cast(image, tf.float32) / 255.0

    # Expand the channel dimension
    image = tf.expand_dims(image, axis=-1)

    # Resize the image
    image = tf.image.resize(image, (32, 32))

    return image, label


"""
214
`tf.data.Dataset` objects have a `shuffle` method which shuffles the data.
215
This method has a `buffer_size` parameter which controls the size of the
216
buffer. If you set this value to `len(train_images)`, the whole dataset will
217
be shuffled. If the buffer size is equal to the length of the dataset,
218
then the elements will be shuffled in a completely random order.
219
220
Main drawback of setting the buffer size to the length of the dataset is that
221
filling the buffer can take a while depending on the size of the dataset.
222
223
Here is a small summary of what's going on here:
224
1) The `shuffle()` method creates a buffer of the specified size.
225
2) The elements of the dataset are randomly shuffled and placed into the buffer.
226
3) The elements of the buffer are then returned in a random order.
227
228
Since `tf.config.experimental.enable_op_determinism()` is enabled and we set
229
random seeds using `keras.utils.set_random_seed` in the beginning of the
230
notebook, the `shuffle()` method will produce same results in the sequential
231
runs.
232
"""
233
# Prepare the datasets, batch-map --> vectorized operations
train_data = (
    train_ds.shuffle(buffer_size=len(train_images))
    .batch(batch_size=64)
    .map(prepare_dataset, num_parallel_calls=tf.data.AUTOTUNE)
    .prefetch(buffer_size=tf.data.AUTOTUNE)
)

test_data = (
    test_ds.batch(batch_size=64)
    .map(prepare_dataset, num_parallel_calls=tf.data.AUTOTUNE)
    .prefetch(buffer_size=tf.data.AUTOTUNE)
)

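"""
As a side note: if you only need a deterministic element *order* from `tf.data`
(rather than fully deterministic ops), parallel transformations also accept a
`deterministic` argument. A minimal sketch of that alternative:

```python
# Request a deterministic output order for this map only, without
# relying on global op determinism.
ordered_ds = test_ds.map(
    prepare_dataset,
    num_parallel_calls=tf.data.AUTOTUNE,
    deterministic=True,
)
```

Note that `deterministic=True` only fixes the order in which outputs are
produced; it does not make the ops themselves deterministic on GPU.
"""
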
"""
248
Train the model for the first time.
249
"""
250
251
history = train_model(train_data, test_data)

"""
Let's save our results into a JSON file, and restart the kernel. After
restarting the kernel, we should see the same results as in the previous run;
this includes the metrics and loss values for both the training and test data.
"""

# Save the history object into a json file
with open("history.json", "w") as fp:
    json.dump(history, fp)

"""
264
Do not run the cell above in order not to overwrite the results. Execute the
265
model training cell again and compare the results.
266
"""
267
268
with open("history.json", "r") as fp:
269
history_loaded = json.load(fp)
270
271
272
"""
273
Compare the results one by one. You will see that they are equal.
274
"""
275
for key in history.keys():
    for i in range(len(history[key])):
        if not np.allclose(history[key][i], history_loaded[key][i]):
            print(f"{key} not equal")

"""
281
## Conclusion
282
283
In this tutorial, you learned how to control the randomness sources in Keras and
284
TensorFlow. You also learned how to reproduce the results of a model training
285
process.
286
287
If you want to initialize the model with the same weights everytime, you need to
288
set `kernel_initializer` and `bias_initializer` parameters of the layers and provide
289
a `seed` value to the initializer.
290
291
There still may be some inconsistencies due to numerical error accumulation such
292
as using `recurrent_dropout` in RNN layers.
293
294
Reproducibility is subject to the environment. You'll get the same results if you
295
run the notebook or the code on the same machine with the same environment.
296
"""
297
298