CoCalc -- reproducibility

GitHub Repository: keras-team/keras-io
Path: blob/master/examples/keras_recipes/reproducibility_recipes.py
³⁵⁰⁷ views
1
"""
2
Title: Reproducibility in Keras Models
3
Author: [Frightera](https://github.com/Frightera)
4
Date created: 2023/05/05
5
Last modified: 2023/05/05
6
Description: Demonstration of random weight initialization and reproducibility in Keras models.
7
Accelerator: GPU
8
"""
9

10
"""
11
## Introduction
12

13
This example demonstrates how to control randomness in Keras models. Sometimes
14
you may want to reproduce the exact same results across runs, for experimentation
15
purposes or to debug a problem.
16
"""
17

18
"""
19
## Setup
20
"""
21
import json
22
import numpy as np
23
import tensorflow as tf
24
import keras
25
from keras import layers
26
from keras import initializers
27

28
# Set the seed using keras.utils.set_random_seed. This will set:
29
# 1) `numpy` seed
30
# 2) backend random seed
31
# 3) `python` random seed
32
keras.utils.set_random_seed(812)
33

34
# If using TensorFlow, this will make GPU ops as deterministic as possible,
35
# but it will affect the overall performance, so be mindful of that.
36
tf.config.experimental.enable_op_determinism()
37

38

39
"""
40
## Weight initialization in Keras
41

42
Most of the layers in Keras have `kernel_initializer` and `bias_initializer`
43
parameters. These parameters allow you to specify the strategy used for
44
initializing the weights of layer variables. The following built-in initializers
45
are available as part of `keras.initializers`:
46
"""
47

48
initializers_list = [
49
    initializers.RandomNormal,
50
    initializers.RandomUniform,
51
    initializers.TruncatedNormal,
52
    initializers.VarianceScaling,
53
    initializers.GlorotNormal,
54
    initializers.GlorotUniform,
55
    initializers.HeNormal,
56
    initializers.HeUniform,
57
    initializers.LecunNormal,
58
    initializers.LecunUniform,
59
    initializers.Orthogonal,
60
]
61

62
"""
63
In a reproducible model, the weights of the model should be initialized with
64
same values in subsequent runs. First, we'll check how initializers behave when
65
they are called multiple times with same `seed` value.
66
"""
67

68
for initializer in initializers_list:
69
    print(f"Running {initializer}")
70

71
    for iteration in range(2):
72
        # In order to get same results across multiple runs from an initializer,
73
        # you can specify a seed value.
74
        result = float(initializer(seed=42)(shape=(1, 1)))
75
        print(f"\tIteration --> {iteration} // Result --> {result}")
76
    print("\n")
77

78

79
"""
80
Now, let's inspect how two different initializer objects behave when they are
81
have the same seed value.
82
"""
83

84
# Setting the seed value for an initializer will cause two different objects
85
# to produce same results.
86
glorot_normal_1 = keras.initializers.GlorotNormal(seed=42)
87
glorot_normal_2 = keras.initializers.GlorotNormal(seed=42)
88

89
input_dim, neurons = 3, 5
90

91
# Call two different objects with same shape
92
result_1 = glorot_normal_1(shape=(input_dim, neurons))
93
result_2 = glorot_normal_2(shape=(input_dim, neurons))
94

95
# Check if the results are equal.
96
equal = np.allclose(result_1, result_2)
97
print(f"Are the results equal? {equal}")
98

99
"""
100
If the seed value is not set (or different seed values are used), two different
101
objects will produce different results. Since the random seed is set at the beginning
102
of the notebook, the results will be same in the sequential runs. This is related
103
to the `keras.utils.set_random_seed`.
104
"""
105

106
glorot_normal_3 = keras.initializers.GlorotNormal()
107
glorot_normal_4 = keras.initializers.GlorotNormal()
108

109
# Let's call the initializer.
110
result_3 = glorot_normal_3(shape=(input_dim, neurons))
111

112
# Call the second initializer.
113
result_4 = glorot_normal_4(shape=(input_dim, neurons))
114

115
equal = np.allclose(result_3, result_4)
116
print(f"Are the results equal? {equal}")
117

118
"""
119
`result_3` and `result_4` will be different, but when you run the notebook
120
again, `result_3` will have identical values to the ones in the previous run.
121
Same goes for `result_4`.
122
"""
123

124
"""
125
## Reproducibility in model training process
126
If you want to reproduce the results of a model training process, you need to
127
control the randomness sources during the training process. In order to show a
128
realistic example, this section utilizes `tf.data` using parallel map and shuffle
129
operations.
130

131
In order to start, let's create a simple function which returns the history
132
object of the Keras model.
133
"""
134

135

136
def train_model(train_data: tf.data.Dataset, test_data: tf.data.Dataset) -> dict:
137
    model = keras.Sequential(
138
        [
139
            layers.Conv2D(32, (3, 3), activation="relu"),
140
            layers.MaxPooling2D((2, 2)),
141
            layers.Dropout(0.2),
142
            layers.Conv2D(32, (3, 3), activation="relu"),
143
            layers.MaxPooling2D((2, 2)),
144
            layers.Dropout(0.2),
145
            layers.Conv2D(32, (3, 3), activation="relu"),
146
            layers.GlobalAveragePooling2D(),
147
            layers.Dense(64, activation="relu"),
148
            layers.Dropout(0.2),
149
            layers.Dense(10, activation="softmax"),
150
        ]
151
    )
152

153
    model.compile(
154
        optimizer="adam",
155
        loss="sparse_categorical_crossentropy",
156
        metrics=["accuracy"],
157
        jit_compile=False,
158
    )
159
    # jit_compile's default value is "auto" which will cause some problems in some
160
    # ops, therefore it's set to False.
161

162
    # model.fit has a `shuffle` parameter which has a default value of `True`.
163
    # If you are using array-like objects, this will shuffle the data before
164
    # training. This argument is ignored when `x` is a generator or
165
    # `tf.data.Dataset`.
166
    history = model.fit(train_data, epochs=2, validation_data=test_data)
167

168
    print(f"Model accuracy on test data: {model.evaluate(test_data)[1] * 100:.2f}%")
169

170
    return history.history
171

172

173
# Load the MNIST dataset
174
(train_images, train_labels), (
175
    test_images,
176
    test_labels,
177
) = keras.datasets.mnist.load_data()
178

179
# Construct tf.data.Dataset objects
180
train_ds = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
181
test_ds = tf.data.Dataset.from_tensor_slices((test_images, test_labels))
182

183
"""
184
Remember we called `tf.config.experimental.enable_op_determinism()` at the
185
beginning of the function. This makes the `tf.data` operations deterministic.
186
However, making `tf.data` operations deterministic comes with a performance
187
cost. If you want to learn more about it, please check this
188
[official guide](https://www.tensorflow.org/api_docs/python/tf/config/experimental/enable_op_determinism#determinism_and_tfdata).
189

190
Small summary what's going on here. Models have `kernel_initializer` and
191
`bias_initializer` parameters. Since we set random seeds using
192
`keras.utils.set_random_seed` in the beginning of the notebook, the initializers
193
will produce same results in the sequential runs. Additionally, TensorFlow
194
operations have now become deterministic. Frequently, you will be utilizing GPUs
195
that have thousands of hardware threads which causes non-deterministic behavior
196
to occur.
197
"""
198

199

200
def prepare_dataset(image, label):
201
    # Cast and normalize the image
202
    image = tf.cast(image, tf.float32) / 255.0
203

204
    # Expand the channel dimension
205
    image = tf.expand_dims(image, axis=-1)
206

207
    # Resize the image
208
    image = tf.image.resize(image, (32, 32))
209

210
    return image, label
211

212

213
"""
214
`tf.data.Dataset` objects have a `shuffle` method which shuffles the data.
215
This method has a `buffer_size` parameter which controls the size of the
216
buffer. If you set this value to `len(train_images)`, the whole dataset will
217
be shuffled. If the buffer size is equal to the length of the dataset,
218
then the elements will be shuffled in a completely random order.
219

220
Main drawback of setting the buffer size to the length of the dataset is that
221
filling the buffer can take a while depending on the size of the dataset.
222

223
Here is a small summary of what's going on here:
224
1) The `shuffle()` method creates a buffer of the specified size.
225
2) The elements of the dataset are randomly shuffled and placed into the buffer.
226
3) The elements of the buffer are then returned in a random order.
227

228
Since `tf.config.experimental.enable_op_determinism()` is enabled and we set
229
random seeds using `keras.utils.set_random_seed` in the beginning of the
230
notebook, the `shuffle()` method will produce same results in the sequential
231
runs.
232
"""
233
# Prepare the datasets, batch-map --> vectorized operations
234
train_data = (
235
    train_ds.shuffle(buffer_size=len(train_images))
236
    .batch(batch_size=64)
237
    .map(prepare_dataset, num_parallel_calls=tf.data.AUTOTUNE)
238
    .prefetch(buffer_size=tf.data.AUTOTUNE)
239
)
240

241
test_data = (
242
    test_ds.batch(batch_size=64)
243
    .map(prepare_dataset, num_parallel_calls=tf.data.AUTOTUNE)
244
    .prefetch(buffer_size=tf.data.AUTOTUNE)
245
)
246

247
"""
248
Train the model for the first time.
249
"""
250

251
history = train_model(train_data, test_data)
252

253
"""
254
Let's save our results into a JSON file, and restart the kernel. After
255
restarting the kernel, we should see the same results as the previous run,
256
this includes metrics and loss values both on the training and test data.
257
"""
258

259
# Save the history object into a json file
260
with open("history.json", "w") as fp:
261
    json.dump(history, fp)
262

263
"""
264
Do not run the cell above in order not to overwrite the results. Execute the
265
model training cell again and compare the results.
266
"""
267

268
with open("history.json", "r") as fp:
269
    history_loaded = json.load(fp)
270

271

272
"""
273
Compare the results one by one. You will see that they are equal.
274
"""
275
for key in history.keys():
276
    for i in range(len(history[key])):
277
        if not np.allclose(history[key][i], history_loaded[key][i]):
278
            print(f"{key} not equal")
279

280
"""
281
## Conclusion
282

283
In this tutorial, you learned how to control the randomness sources in Keras and
284
TensorFlow. You also learned how to reproduce the results of a model training
285
process.
286

287
If you want to initialize the model with the same weights everytime, you need to
288
set `kernel_initializer` and `bias_initializer` parameters of the layers and provide
289
a `seed` value to the initializer.
290

291
There still may be some inconsistencies due to numerical error accumulation such
292
as using `recurrent_dropout` in RNN layers.
293

294
Reproducibility is subject to the environment. You'll get the same results if you
295
run the notebook or the code on the same machine with the same environment.
296
"""
297

298
Product

Resources

Company