License: OTHER
Kernel: Python 3

Credits: Forked from deep-learning-keras-tensorflow by Valerio Maggio

Theano

A language in a language

Dealing with weight matrices and gradients can be tricky and sometimes far from trivial. Theano is a great framework for handling vectors, matrices and high-dimensional tensor algebra. Most of this tutorial will refer to Theano; however, TensorFlow is another great framework capable of providing an incredible abstraction for complex algebra. More on TensorFlow in the next chapters.

import theano
import theano.tensor as T

Symbolic variables

Theano has its own variables and functions, defined as follows.

x = T.scalar()
x

Variables can be used in expressions

y = 3*(x**2) + 1

y is an expression now

Result is symbolic as well

type(y)
y.shape
Shape.0

Printing

As we are about to see, normal printing isn't very informative when it comes to Theano expressions.

print(y)
Elemwise{add,no_inplace}.0
theano.pprint(y)
'((TensorConstant{3} * (<TensorType(float32, scalar)> ** TensorConstant{2})) + TensorConstant{1})'
theano.printing.debugprint(y)
Elemwise{add,no_inplace} [@A] ''
 |Elemwise{mul,no_inplace} [@B] ''
 | |TensorConstant{3} [@C]
 | |Elemwise{pow,no_inplace} [@D] ''
 | | |<TensorType(float32, scalar)> [@E]
 | | |TensorConstant{2} [@F]
 |TensorConstant{1} [@G]

Evaluating expressions

Supply a dict mapping variables to values

y.eval({x: 2})
array(13.0, dtype=float32)

Or compile a function

f = theano.function([x], y)
f(2)
array(13.0, dtype=float32)

Other tensor types

X = T.vector()
X = T.matrix()
X = T.tensor3()
X = T.tensor4()

Automatic differentiation

  • Gradients are free!

x = T.scalar()
y = T.log(x)

gradient = T.grad(y, x)
print(gradient)
print(gradient.eval({x: 2}))
print(2 * gradient)
Elemwise{true_div}.0
0.5
Elemwise{mul,no_inplace}.0

Shared Variables

  • Symbolic + Storage

import numpy as np

x = theano.shared(np.zeros((2, 3), dtype=theano.config.floatX))
x
<CudaNdarrayType(float32, matrix)>

We can get and set the variable's value

values = x.get_value()
print(values.shape)
print(values)
(2, 3)
[[ 0.  0.  0.]
 [ 0.  0.  0.]]
x.set_value(values)

Shared variables can be used in expressions as well

(x + 2) ** 2
Elemwise{pow,no_inplace}.0

Their value is used as input when evaluating

((x + 2) ** 2).eval()
array([[ 4.,  4.,  4.],
       [ 4.,  4.,  4.]], dtype=float32)
theano.function([], (x + 2) ** 2)()
array([[ 4.,  4.,  4.],
       [ 4.,  4.,  4.]], dtype=float32)

Updates

  • Store results of function evaluation

  • dict mapping shared variables to new values

count = theano.shared(0)
new_count = count + 1
updates = {count: new_count}
f = theano.function([], count, updates=updates)
f()
array(0)
f()
array(1)
f()
array(2)

Warming up! Logistic Regression

%matplotlib inline
import numpy as np
import pandas as pd
import theano
import theano.tensor as T
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import LabelEncoder
from keras.utils import np_utils
Using Theano backend.

For this section we will use the Kaggle Otto challenge. If you want to follow along, get the data from Kaggle: https://www.kaggle.com/c/otto-group-product-classification-challenge/data

About the data

The Otto Group is one of the world’s biggest e-commerce companies, so a consistent analysis of the performance of its products is crucial. However, due to diverse global infrastructure, many identical products get classified differently. For this competition, we have provided a dataset with 93 features for more than 200,000 products. The objective is to build a predictive model which is able to distinguish between our main product categories. Each row corresponds to a single product. There are a total of 93 numerical features, which represent counts of different events. All features have been obfuscated and will not be defined any further.


def load_data(path, train=True):
    """Load data from a CSV File

    Parameters
    ----------
    path: str
        The path to the CSV file
    train: bool (default True)
        Decide whether or not data are *training data*.
        If True, some random shuffling is applied.

    Return
    ------
    X: numpy.ndarray
        The data as a multi dimensional array of floats
    ids: numpy.ndarray
        A vector of ids for each sample
    """
    df = pd.read_csv(path)
    X = df.values.copy()
    if train:
        np.random.shuffle(X)  # https://youtu.be/uyUXoap67N8
        X, labels = X[:, 1:-1].astype(np.float32), X[:, -1]
        return X, labels
    else:
        X, ids = X[:, 1:].astype(np.float32), X[:, 0].astype(str)
        return X, ids
def preprocess_data(X, scaler=None):
    """Preprocess input data by standardising features:
    removing the mean and scaling to unit variance"""
    if not scaler:
        scaler = StandardScaler()
        scaler.fit(X)
    X = scaler.transform(X)
    return X, scaler


def preprocess_labels(labels, encoder=None, categorical=True):
    """Encode labels with values among 0 and `n-classes-1`"""
    if not encoder:
        encoder = LabelEncoder()
        encoder.fit(labels)
    y = encoder.transform(labels).astype(np.int32)
    if categorical:
        y = np_utils.to_categorical(y)
    return y, encoder
print("Loading data...")
X, labels = load_data('train.csv', train=True)
X, scaler = preprocess_data(X)
Y, encoder = preprocess_labels(labels)

X_test, ids = load_data('test.csv', train=False)
X_test, ids = X_test[:1000], ids[:1000]

# Plotting the data
print(X_test[:1])

X_test, _ = preprocess_data(X_test, scaler)

nb_classes = Y.shape[1]
print(nb_classes, 'classes')

dims = X.shape[1]
print(dims, 'dims')
Loading data... [[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 3. 0. 0. 0. 3. 2. 1. 0. 0. 0. 0. 0. 0. 0. 5. 3. 1. 1. 0. 0. 0. 0. 0. 1. 0. 0. 1. 0. 1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 3. 0. 0. 0. 0. 1. 1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 11. 1. 20. 0. 0. 0. 0. 0.]] (9L, 'classes') (93L, 'dims')

Now let's create and train a logistic regression model.

Hands On - Logistic Regression

# Based on example from DeepLearning.net
rng = np.random
N = 400
feats = 93
training_steps = 1

# Declare Theano symbolic variables
x = T.matrix("x")
y = T.vector("y")
w = theano.shared(rng.randn(feats), name="w")
b = theano.shared(0., name="b")

# Construct Theano expression graph
p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b))        # Probability that target = 1
prediction = p_1 > 0.5                         # The prediction thresholded
xent = -y * T.log(p_1) - (1-y) * T.log(1-p_1)  # Cross-entropy loss function
cost = xent.mean() + 0.01 * (w ** 2).sum()     # The cost to minimize
gw, gb = T.grad(cost, [w, b])                  # Compute the gradient of the cost
                                               # (we shall return to this in a
                                               # following section of this tutorial)

# Compile
train = theano.function(
    inputs=[x, y],
    outputs=[prediction, xent],
    updates=((w, w - 0.1 * gw), (b, b - 0.1 * gb)),
    allow_input_downcast=True)
predict = theano.function(inputs=[x], outputs=prediction,
                          allow_input_downcast=True)

# Transform for class1
y_class1 = []
for i in Y:
    y_class1.append(i[0])
y_class1 = np.array(y_class1)

# Train
for i in range(training_steps):
    print('Epoch %s' % (i+1,))
    pred, err = train(X, y_class1)

print("target values for Data:")
print(y_class1)
print("prediction on training set:")
print(predict(X))
Epoch 1 target values for Data: [ 0. 0. 1. ..., 0. 0. 0.] prediction on training set: [0 0 0 ..., 0 0 0]
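With only one training step the predictions are still crude; a simple way to quantify them is training accuracy. A sketch with small synthetic arrays (in the notebook you would use `predict(X)` and `y_class1` instead of these stand-ins):

```python
import numpy as np

# Hypothetical stand-ins for predict(X) and y_class1
pred = np.array([0, 0, 1, 0, 1])
target = np.array([0, 1, 1, 0, 0])

# Accuracy = fraction of samples where prediction matches target
accuracy = np.mean(pred == target)
print(accuracy)  # 3 of 5 labels match: 0.6
```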