Kernel: Python [conda env:py37]
In [1]:
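A plausible setup cell for this notebook, assuming the standard stack used throughout the chapter (NumPy, pandas, matplotlib, and the mglearn helper package that accompanies the book):

```python
# Hypothetical preamble; the original In [1] contents are an assumption.
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import mglearn
```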
Supervised Learning
Classification and Regression
Generalization, Overfitting, and Underfitting
Relation of Model Complexity to Dataset Size
Supervised Machine Learning Algorithms
Some Sample Datasets
In [2]:
Out[2]:
X.shape: (26, 2)
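The shape reported above belongs to a small synthetic two-class dataset with two features and 26 samples. A minimal sketch that produces output of this kind, assuming mglearn's `make_forge` helper:

```python
import matplotlib.pyplot as plt
import mglearn

# Synthetic two-feature, two-class "forge" dataset (26 samples).
X, y = mglearn.datasets.make_forge()

# Scatter plot colored by class label.
mglearn.discrete_scatter(X[:, 0], X[:, 1], y)
plt.legend(["Class 0", "Class 1"], loc=4)
plt.xlabel("First feature")
plt.ylabel("Second feature")
print("X.shape:", X.shape)
```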
In [3]:
Out[3]:
Text(0, 0.5, 'Target')
In [4]:
Out[4]:
cancer.keys():
dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])
In [5]:
Out[5]:
Shape of cancer data: (569, 30)
In [6]:
Out[6]:
Sample counts per class:
{'malignant': 212, 'benign': 357}
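The keys, shape, and class counts above come from the Wisconsin breast cancer data bundled with scikit-learn. A sketch that reproduces these summaries:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()
print("cancer.keys():\n", cancer.keys())
print("Shape of cancer data:", cancer.data.shape)

# Count samples per class; target 0 is malignant, 1 is benign.
print("Sample counts per class:\n",
      {n: v for n, v in zip(cancer.target_names, np.bincount(cancer.target))})
print("Feature names:\n", cancer.feature_names)
```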
In [7]:
Out[7]:
Feature names:
['mean radius' 'mean texture' 'mean perimeter' 'mean area'
'mean smoothness' 'mean compactness' 'mean concavity'
'mean concave points' 'mean symmetry' 'mean fractal dimension'
'radius error' 'texture error' 'perimeter error' 'area error'
'smoothness error' 'compactness error' 'concavity error'
'concave points error' 'symmetry error' 'fractal dimension error'
'worst radius' 'worst texture' 'worst perimeter' 'worst area'
'worst smoothness' 'worst compactness' 'worst concavity'
'worst concave points' 'worst symmetry' 'worst fractal dimension']
In [8]:
Out[8]:
Data shape: (506, 13)
In [9]:
Out[9]:
X.shape: (506, 104)
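These two shapes correspond to the Boston housing data (506 samples, 13 features) and an extended version that adds all pairwise feature products, giving 13 + 13·14/2 = 104 features. A sketch using mglearn's helper, assuming a scikit-learn version that still ships the Boston data (it was removed in release 1.2):

```python
import mglearn

# 13 rescaled original features plus all products of feature pairs.
X, y = mglearn.datasets.load_extended_boston()
print("X.shape:", X.shape)
```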
k-Nearest Neighbors
k-Neighbors classification
In [10]:
Out[10]:
In [11]:
Out[11]:
In [12]:
In [13]:
In [14]:
Out[14]:
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_jobs=None, n_neighbors=3, p=2,
weights='uniform')
In [15]:
Out[15]:
Test set predictions: [1 0 1 0 1 0 0]
In [16]:
Out[16]:
Test set accuracy: 0.857
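A minimal k-nearest-neighbors classification sketch consistent with the predictions and accuracy above, assuming the forge data and a three-neighbor model:

```python
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
import mglearn

# Split the forge data, fit a 3-neighbor classifier, and evaluate it.
X, y = mglearn.datasets.make_forge()
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)

# Prediction takes a majority vote over the 3 closest training points.
print("Test set predictions:", clf.predict(X_test))
print("Test set accuracy: {:.3f}".format(clf.score(X_test, y_test)))
```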
Analyzing KNeighborsClassifier
In [17]:
Out[17]:
<matplotlib.legend.Legend at 0x7f9df3d1e908>
In [18]:
Out[18]:
<matplotlib.legend.Legend at 0x7f9df3c61550>
k-neighbors regression
In [19]:
Out[19]:
In [20]:
Out[20]:
In [21]:
Out[21]:
KNeighborsRegressor(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_jobs=None, n_neighbors=3, p=2,
weights='uniform')
In [22]:
Out[22]:
Test set predictions:
[-0.054 0.357 1.137 -1.894 -1.139 -1.631 0.357 0.912 -0.447 -1.139]
In [23]:
Out[23]:
Test set R^2: 0.83
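The same neighbors idea applied to regression: the prediction is the mean target of the nearest neighbors, and `score` reports R². A sketch on the one-dimensional wave data (assuming mglearn's `make_wave` with 40 samples):

```python
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
import mglearn

X, y = mglearn.datasets.make_wave(n_samples=40)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

reg = KNeighborsRegressor(n_neighbors=3)
reg.fit(X_train, y_train)

# Each prediction averages the targets of the 3 closest training points.
print("Test set predictions:\n", reg.predict(X_test))
print("Test set R^2: {:.2f}".format(reg.score(X_test, y_test)))
```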
Analyzing KNeighborsRegressor
In [24]:
Out[24]:
<matplotlib.legend.Legend at 0x7f9df3fa74e0>
Strengths, weaknesses, and parameters
Linear Models
Linear models for regression
In [25]:
Out[25]:
w[0]: 0.393906 b: -0.031804
Linear regression aka ordinary least squares
In [26]:
In [27]:
Out[27]:
lr.coef_: [0.394]
lr.intercept_: -0.031804343026759746
In [28]:
Out[28]:
Training set score: 0.67
Test set score: 0.66
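Ordinary least squares stores the slope in `coef_` and the offset in `b`-style attribute `intercept_`. A sketch consistent with the values above, assuming the wave data with 60 samples and `random_state=42`:

```python
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
import mglearn

X, y = mglearn.datasets.make_wave(n_samples=60)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

lr = LinearRegression().fit(X_train, y_train)
print("lr.coef_:", lr.coef_)            # slope w
print("lr.intercept_:", lr.intercept_)  # offset b
print("Training set score: {:.2f}".format(lr.score(X_train, y_train)))
print("Test set score: {:.2f}".format(lr.score(X_test, y_test)))
```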
In [29]:
In [30]:
Out[30]:
Training set score: 0.95
Test set score: 0.61
Ridge regression
In [31]:
Out[31]:
Training set score: 0.89
Test set score: 0.75
In [32]:
Out[32]:
Training set score: 0.79
Test set score: 0.64
In [33]:
Out[33]:
Training set score: 0.93
Test set score: 0.77
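Ridge regression adds an L2 penalty on the coefficients; a larger `alpha` shrinks them more strongly. A sketch of the alpha comparison on the extended Boston data (assuming the mglearn helper):

```python
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
import mglearn

X, y = mglearn.datasets.load_extended_boston()
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Larger alpha means stronger shrinkage: lower training score,
# often better generalization on this overparameterized dataset.
for alpha in [0.1, 1, 10]:
    ridge = Ridge(alpha=alpha).fit(X_train, y_train)
    print("alpha={}: train {:.2f}, test {:.2f}".format(
        alpha, ridge.score(X_train, y_train), ridge.score(X_test, y_test)))
```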
In [34]:
Out[34]:
<matplotlib.legend.Legend at 0x7f9df22abd30>
In [35]:
Out[35]:
Lasso
In [36]:
Out[36]:
Training set score: 0.29
Test set score: 0.21
Number of features used: 4
In [37]:
Out[37]:
Training set score: 0.90
Test set score: 0.77
Number of features used: 33
In [38]:
Out[38]:
Training set score: 0.95
Test set score: 0.64
Number of features used: 96
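Lasso uses an L1 penalty, which drives many coefficients exactly to zero; the "number of features used" lines count the nonzero coefficients. A sketch of the alpha sweep (the `max_iter` increase is needed so the coordinate-descent solver converges at small alpha):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
import mglearn

X, y = mglearn.datasets.load_extended_boston()
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Smaller alpha keeps more features but moves the model closer to
# unregularized linear regression.
for alpha in [1.0, 0.01, 0.0001]:
    lasso = Lasso(alpha=alpha, max_iter=100000).fit(X_train, y_train)
    print("alpha={}: train {:.2f}, test {:.2f}, features used: {}".format(
        alpha, lasso.score(X_train, y_train), lasso.score(X_test, y_test),
        np.sum(lasso.coef_ != 0)))
```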
In [39]:
Out[39]:
Text(0, 0.5, 'Coefficient magnitude')
Linear models for classification
In [40]:
Out[40]:
/home/andy/checkout/scikit-learn/sklearn/svm/base.py:922: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
"the number of iterations.", ConvergenceWarning)
<matplotlib.legend.Legend at 0x7f9df213ecf8>
In [41]:
Out[41]:
In [42]:
Out[42]:
Training set score: 0.953
Test set score: 0.958
In [43]:
Out[43]:
Training set score: 0.972
Test set score: 0.965
In [44]:
Out[44]:
Training set score: 0.934
Test set score: 0.930
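For logistic regression, `C` is the inverse regularization strength: larger `C` fits the training data more closely. A sketch of the comparison on the cancer data; `max_iter` is raised here so the solver converges without the warning shown above:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=42)

# Larger C -> weaker regularization -> higher training accuracy,
# with a risk of overfitting.
for C in [0.01, 1, 100]:
    logreg = LogisticRegression(C=C, max_iter=10000).fit(X_train, y_train)
    print("C={}: train {:.3f}, test {:.3f}".format(
        C, logreg.score(X_train, y_train), logreg.score(X_test, y_test)))
```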
In [45]:
Out[45]:
<matplotlib.legend.Legend at 0x7f9df1f88dd8>
In [46]:
Out[46]:
Training accuracy of l1 logreg with C=0.001: 0.91
Test accuracy of l1 logreg with C=0.001: 0.92
/home/andy/checkout/scikit-learn/sklearn/svm/base.py:922: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
"the number of iterations.", ConvergenceWarning)
Training accuracy of l1 logreg with C=1.000: 0.96
Test accuracy of l1 logreg with C=1.000: 0.96
Training accuracy of l1 logreg with C=100.000: 0.99
Test accuracy of l1 logreg with C=100.000: 0.98
<matplotlib.legend.Legend at 0x7f9df1f34048>
Linear models for multiclass classification
In [47]:
Out[47]:
<matplotlib.legend.Legend at 0x7f9df2173d30>
In [48]:
Out[48]:
Coefficient shape: (3, 2)
Intercept shape: (3,)
In [49]:
Out[49]:
<matplotlib.legend.Legend at 0x7f9df3ea9550>
In [50]:
Out[50]:
Text(0, 0.5, 'Feature 1')
Strengths, weaknesses, and parameters
In [51]:
In [52]:
In [53]:
Naive Bayes Classifiers
In [54]:
In [55]:
Out[55]:
Feature counts:
{0: array([0, 1, 0, 2]), 1: array([2, 0, 2, 1])}
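The counts above illustrate the core statistic a Bernoulli naive Bayes model collects: how often each binary feature is nonzero within each class. A toy sketch consistent with that output:

```python
import numpy as np

# Four samples with four binary features, two classes.
X = np.array([[0, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 0, 0, 1],
              [1, 0, 1, 0]])
y = np.array([0, 1, 0, 1])

counts = {}
for label in np.unique(y):
    # For each class, count how often each feature is nonzero.
    counts[label] = X[y == label].sum(axis=0)
print("Feature counts:\n", counts)
```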
Strengths, weaknesses, and parameters
Decision trees
In [56]:
Out[56]:
Building decision trees
In [57]:
Out[57]:
Controlling complexity of decision trees
In [58]:
Out[58]:
Accuracy on training set: 1.000
Accuracy on test set: 0.937
In [59]:
Out[59]:
Accuracy on training set: 0.988
Accuracy on test set: 0.951
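An unpruned decision tree memorizes the training set (100% training accuracy); limiting the depth trades some training accuracy for better generalization, as the two result blocks above show. A sketch on the cancer data:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=42)

# Fully grown tree: pure leaves, perfect training accuracy.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("Accuracy on training set: {:.3f}".format(tree.score(X_train, y_train)))
print("Accuracy on test set: {:.3f}".format(tree.score(X_test, y_test)))

# Pre-pruned tree: depth limited to 4 levels.
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
print("Accuracy on training set: {:.3f}".format(tree.score(X_train, y_train)))
print("Accuracy on test set: {:.3f}".format(tree.score(X_test, y_test)))
```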
Analyzing Decision Trees
In [60]:
In [61]:
Out[61]:
Feature Importance in trees
In [62]:
Out[62]:
Feature importances:
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.01 0.048
0. 0. 0.002 0. 0. 0. 0. 0. 0.727 0.046 0. 0.
0.014 0. 0.018 0.122 0.012 0. ]
In [63]:
Out[63]:
In [64]:
Out[64]:
Feature importances: [0. 1.]
In [65]:
Out[65]:
Text(0, 0.5, 'Price in $/Mbyte')
In [66]:
In [67]:
Out[67]:
<matplotlib.legend.Legend at 0x7f9df1e9d5c0>
Strengths, weaknesses, and parameters
Ensembles of Decision Trees
Random forests
Building random forests
Analyzing random forests
In [68]:
Out[68]:
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
max_depth=None, max_features='auto', max_leaf_nodes=None,
min_impurity_decrease=0.0, min_impurity_split=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=5, n_jobs=None,
oob_score=False, random_state=2, verbose=0, warm_start=False)
In [69]:
Out[69]:
[<matplotlib.lines.Line2D at 0x7f9df1e78c50>,
<matplotlib.lines.Line2D at 0x7f9df1e78080>]
In [70]:
Out[70]:
Accuracy on training set: 1.000
Accuracy on test set: 0.972
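A random forest averages many randomized trees (bootstrap samples plus random feature subsets at each split), which smooths out the overfitting of a single tree. A sketch on the cancer data with 100 trees:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, random_state=0)

# 100 randomized trees; predictions are made by soft voting.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print("Accuracy on training set: {:.3f}".format(forest.score(X_train, y_train)))
print("Accuracy on test set: {:.3f}".format(forest.score(X_test, y_test)))
```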
In [71]:
Out[71]:
Strengths, weaknesses, and parameters
Gradient Boosted Regression Trees (Gradient Boosting Machines)
In [72]:
Out[72]:
Accuracy on training set: 1.000
Accuracy on test set: 0.958
In [73]:
Out[73]:
Accuracy on training set: 0.991
Accuracy on test set: 0.972
In [74]:
Out[74]:
Accuracy on training set: 0.988
Accuracy on test set: 0.965
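Gradient boosting builds shallow trees sequentially, each correcting the errors of the previous ones; either limiting tree depth or lowering the learning rate reins in the overfitting visible in the first result block. A sketch of the three settings compared above:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, random_state=0)

# Defaults, stumps (max_depth=1), and a small learning_rate.
for kwargs in [{}, {"max_depth": 1}, {"learning_rate": 0.01}]:
    gbrt = GradientBoostingClassifier(random_state=0, **kwargs)
    gbrt.fit(X_train, y_train)
    print(kwargs, "-> train {:.3f}, test {:.3f}".format(
        gbrt.score(X_train, y_train), gbrt.score(X_test, y_test)))
```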
In [75]:
Out[75]:
Strengths, weaknesses, and parameters
Kernelized Support Vector Machines
Linear Models and Non-linear Features
In [76]:
Out[76]:
Text(0, 0.5, 'Feature 1')
In [77]:
Out[77]:
/home/andy/checkout/scikit-learn/sklearn/svm/base.py:922: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
"the number of iterations.", ConvergenceWarning)
Text(0, 0.5, 'Feature 1')
In [78]:
Out[78]:
Text(0.5, 0, 'feature1 ** 2')
In [79]:
Out[79]:
/home/andy/checkout/scikit-learn/sklearn/svm/base.py:922: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
"the number of iterations.", ConvergenceWarning)
Text(0.5, 0, 'feature1 ** 2')
In [80]:
Out[80]:
Text(0, 0.5, 'Feature 1')
The Kernel Trick
Understanding SVMs
In [81]:
Out[81]:
Text(0, 0.5, 'Feature 1')
Tuning SVM parameters
In [82]:
Out[82]:
<matplotlib.legend.Legend at 0x7f9df14a4d68>
In [83]:
Out[83]:
Accuracy on training set: 1.00
Accuracy on test set: 0.63
In [84]:
Out[84]:
Text(0, 0.5, 'Feature magnitude')
Preprocessing data for SVMs
In [85]:
Out[85]:
Minimum for each feature
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0.]
Maximum for each feature
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1.]
In [86]:
In [87]:
Out[87]:
Accuracy on training set: 0.948
Accuracy on test set: 0.951
In [88]:
Out[88]:
Accuracy on training set: 0.988
Accuracy on test set: 0.972
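Kernel SVMs are very sensitive to feature scale, which is why the unscaled model above scores only 0.63 on the test set. Rescaling every feature to the [0, 1] range using training-set statistics, and then optionally raising `C`, recovers the strong results shown. A minimal "by hand" scaling sketch:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, random_state=0)

# Rescale each feature to [0, 1] using the training-set minimum and range,
# then apply the SAME transformation to the test set.
min_on_training = X_train.min(axis=0)
range_on_training = (X_train - min_on_training).max(axis=0)
X_train_scaled = (X_train - min_on_training) / range_on_training
X_test_scaled = (X_test - min_on_training) / range_on_training

# A larger C loosens regularization on the now well-scaled features.
svc = SVC(C=1000).fit(X_train_scaled, y_train)
print("Accuracy on training set: {:.3f}".format(svc.score(X_train_scaled, y_train)))
print("Accuracy on test set: {:.3f}".format(svc.score(X_test_scaled, y_test)))
```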
Strengths, weaknesses, and parameters
Neural Networks (Deep Learning)
The Neural Network Model
In [89]:
Out[89]:
In [90]:
Out[90]:
In [91]:
Out[91]:
Text(0, 0.5, 'relu(x), tanh(x)')
In [92]:
Out[92]:
Tuning Neural Networks
In [93]:
Out[93]:
Text(0, 0.5, 'Feature 1')
In [94]:
Out[94]:
Text(0, 0.5, 'Feature 1')
In [95]:
Out[95]:
Text(0, 0.5, 'Feature 1')
In [96]:
Out[96]:
Text(0, 0.5, 'Feature 1')
In [97]:
Out[97]:
In [98]:
Out[98]:
In [99]:
Out[99]:
Cancer data per-feature maxima:
[ 28.11 39.28 188.5 2501. 0.163 0.345 0.427 0.201
0.304 0.097 2.873 4.885 21.98 542.2 0.031 0.135
0.396 0.053 0.079 0.03 36.04 49.54 251.2 4254.
0.223 1.058 1.252 0.291 0.664 0.207]
In [100]:
Out[100]:
Accuracy on training set: 0.94
Accuracy on test set: 0.92
In [101]:
Out[101]:
Accuracy on training set: 0.991
Accuracy on test set: 0.965
/home/andy/checkout/scikit-learn/sklearn/neural_network/multilayer_perceptron.py:562: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.
% self.max_iter, ConvergenceWarning)
In [102]:
Out[102]:
Accuracy on training set: 1.000
Accuracy on test set: 0.972
In [103]:
Out[103]:
Accuracy on training set: 0.988
Accuracy on test set: 0.972
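Like SVMs, multilayer perceptrons expect standardized inputs, and the convergence warning above disappears once the optimizer is given more iterations. A sketch of standardizing the cancer features by hand and fitting an MLP with stronger weight regularization (`alpha=1`):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, random_state=0)

# Standardize to zero mean and unit variance using training statistics only.
mean_on_train = X_train.mean(axis=0)
std_on_train = X_train.std(axis=0)
X_train_scaled = (X_train - mean_on_train) / std_on_train
X_test_scaled = (X_test - mean_on_train) / std_on_train

# More iterations avoid the convergence warning; a larger alpha adds
# stronger L2 regularization on the weights.
mlp = MLPClassifier(max_iter=1000, alpha=1, random_state=0)
mlp.fit(X_train_scaled, y_train)
print("Accuracy on training set: {:.3f}".format(mlp.score(X_train_scaled, y_train)))
print("Accuracy on test set: {:.3f}".format(mlp.score(X_test_scaled, y_test)))
```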
In [104]:
Out[104]:
<matplotlib.colorbar.Colorbar at 0x7f9df09cd710>
Strengths, weaknesses, and parameters
Estimating complexity in neural networks
Uncertainty estimates from classifiers
In [105]:
Out[105]:
GradientBoostingClassifier(criterion='friedman_mse', init=None,
learning_rate=0.1, loss='deviance', max_depth=3,
max_features=None, max_leaf_nodes=None,
min_impurity_decrease=0.0, min_impurity_split=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=100,
n_iter_no_change=None, presort='auto', random_state=0,
subsample=1.0, tol=0.0001, validation_fraction=0.1,
verbose=0, warm_start=False)
The Decision Function
In [106]:
Out[106]:
X_test.shape: (25, 2)
Decision function shape: (25,)
In [107]:
Out[107]:
Decision function: [ 4.136 -1.702 -3.951 -3.626 4.29 3.662]
In [108]:
Out[108]:
Thresholded decision function:
[ True False False False True True False True True True False True
True False True False False False True True True True True False
False]
Predictions:
['red' 'blue' 'blue' 'blue' 'red' 'red' 'blue' 'red' 'red' 'red' 'blue'
'red' 'red' 'blue' 'red' 'blue' 'blue' 'blue' 'red' 'red' 'red' 'red'
'red' 'blue' 'blue']
In [109]:
Out[109]:
pred is equal to predictions: True
In [110]:
Out[110]:
Decision function minimum: -7.69 maximum: 4.29
In [111]:
Out[111]:
<matplotlib.legend.Legend at 0x7f9df09848d0>
Predicting Probabilities
In [112]:
Out[112]:
Shape of probabilities: (25, 2)
In [113]:
Out[113]:
Predicted probabilities:
[[0.016 0.984]
[0.846 0.154]
[0.981 0.019]
[0.974 0.026]
[0.014 0.986]
[0.025 0.975]]
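For a binary classifier, `decision_function` returns one score per sample (its sign encodes the predicted class), while `predict_proba` returns one probability per class, with each row summing to one. A sketch on the two-class circles data; the blue/red naming of the classes is an assumption about the hidden cells:

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Two-class "circles" data with string class names.
X, y = make_circles(noise=0.25, factor=0.5, random_state=1)
y_named = np.array(["blue", "red"])[y]

X_train, X_test, y_train_named, y_test_named = train_test_split(
    X, y_named, random_state=0)

gbrt = GradientBoostingClassifier(random_state=0).fit(X_train, y_train_named)

# One score per sample; positive scores mean the "positive" class.
print("Decision function:", gbrt.decision_function(X_test)[:6])
# One probability per class; rows sum to 1.
print("Predicted probabilities:\n", gbrt.predict_proba(X_test)[:6])
```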
In [114]:
Out[114]:
<matplotlib.legend.Legend at 0x7f9df0868358>
Uncertainty in multiclass classification
In [115]:
Out[115]:
GradientBoostingClassifier(criterion='friedman_mse', init=None,
learning_rate=0.01, loss='deviance', max_depth=3,
max_features=None, max_leaf_nodes=None,
min_impurity_decrease=0.0, min_impurity_split=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=100,
n_iter_no_change=None, presort='auto', random_state=0,
subsample=1.0, tol=0.0001, validation_fraction=0.1,
verbose=0, warm_start=False)
In [116]:
Out[116]:
Decision function shape: (38, 3)
Decision function:
[[-0.529 1.466 -0.504]
[ 1.512 -0.496 -0.503]
[-0.524 -0.468 1.52 ]
[-0.529 1.466 -0.504]
[-0.531 1.282 0.215]
[ 1.512 -0.496 -0.503]]
In [117]:
Out[117]:
Argmax of decision function:
[1 0 2 1 1 0 1 2 1 1 2 0 0 0 0 1 2 1 1 2 0 2 0 2 2 2 2 2 0 0 0 0 1 0 0 2 1
0]
Predictions:
[1 0 2 1 1 0 1 2 1 1 2 0 0 0 0 1 2 1 1 2 0 2 0 2 2 2 2 2 0 0 0 0 1 0 0 2 1
0]
In [118]:
Out[118]:
Predicted probabilities:
[[0.107 0.784 0.109]
[0.789 0.106 0.105]
[0.102 0.108 0.789]
[0.107 0.784 0.109]
[0.108 0.663 0.228]
[0.789 0.106 0.105]]
Sums: [1. 1. 1. 1. 1. 1.]
In [119]:
Out[119]:
Argmax of predicted probabilities:
[1 0 2 1 1 0 1 2 1 1 2 0 0 0 0 1 2 1 1 2 0 2 0 2 2 2 2 2 0 0 0 0 1 0 0 2 1
0]
Predictions:
[1 0 2 1 1 0 1 2 1 1 2 0 0 0 0 1 2 1 1 2 0 2 0 2 2 2 2 2 0 0 0 0 1 0 0 2 1
0]
In [120]:
Out[120]:
unique classes in training data: ['setosa' 'versicolor' 'virginica']
predictions: ['versicolor' 'setosa' 'virginica' 'versicolor' 'versicolor' 'setosa'
'versicolor' 'virginica' 'versicolor' 'versicolor']
argmax of decision function: [1 0 2 1 1 0 1 2 1 1]
argmax combined with classes_: ['versicolor' 'setosa' 'virginica' 'versicolor' 'versicolor' 'setosa'
'versicolor' 'virginica' 'versicolor' 'versicolor']
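The `classes_` attribute holds the class names in the same column order that `decision_function` and `predict_proba` use, so the argmax over columns can always be mapped back to the original labels, even when they are strings. A sketch on the iris data with string targets:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

iris = load_iris()
# Train directly on the string class names.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target_names[iris.target], random_state=42)

gbrt = GradientBoostingClassifier(learning_rate=0.01, random_state=0)
gbrt.fit(X_train, y_train)

print("unique classes in training data:", gbrt.classes_)

# argmax over the per-class columns, then map indices back to names.
argmax_dec_func = np.argmax(gbrt.decision_function(X_test), axis=1)
print("argmax of decision function:", argmax_dec_func[:10])
print("argmax combined with classes_:", gbrt.classes_[argmax_dec_func][:10])
print("predictions:", gbrt.predict(X_test)[:10])
```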