26. Deep Learning#

26.1. TensorFlow#

We are going to look at the TensorFlow example of building a neural network to “read” handwritten digits. This is adapted from https://www.tensorflow.org/tutorials/quickstart/beginner.

Auto encoder def, autodecoder def. Loosely speaking, neural nets combine PCA-like dimensionality reduction with model fitting.

import tensorflow as tf
print("TensorFlow version:", tf.__version__)
TensorFlow version: 2.19.0
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
y_test

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)

What is a sequential model? The docs for tf.keras.models.Sequential can be found at https://www.tensorflow.org/api_docs/python/tf/keras/Sequential.
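One way to picture a sequential model is as plain function composition: each layer takes the previous layer's output as its input. The sketch below uses only the standard library. The helpers `flatten`, `dense`, `relu`, and `sequential` are hypothetical stand-ins written for this illustration, not TensorFlow APIs, and the weights are made up.

```python
# Toy illustration: a "sequential" model applies its layers one after another.
# Every helper here is hypothetical; none of this is the TensorFlow API.

def flatten(image):
    """Turn a 2-D list of pixel rows into one flat list."""
    return [pixel for row in image for pixel in row]

def dense(weights, biases):
    """Return a layer that computes weights @ x + biases for a flat input x."""
    def layer(x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weights, biases)]
    return layer

def relu(x):
    """Element-wise max(0, value)."""
    return [max(0.0, v) for v in x]

def sequential(layers):
    """Chain the layers so each output feeds the next layer."""
    def model(x):
        for layer in layers:
            x = layer(x)
        return x
    return model

# A 2x2 "image" passed through flatten -> dense (2 units) -> relu -> dense (1 unit).
toy_model = sequential([
    flatten,
    dense([[1.0, 0.0, -1.0, 0.0], [0.5, 0.5, 0.5, 0.5]], [0.0, -1.0]),
    relu,
    dense([[2.0, 1.0]], [0.1]),
])

toy_output = toy_model([[1.0, 2.0], [3.0, 4.0]])
print(toy_output)  # a list with one number, the single output unit
```

The real model has the same shape of computation, only with 784 inputs, 128 hidden units, and 10 outputs, and with weights that are learned rather than written by hand.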

model = tf.keras.models.Sequential([
  tf.keras.Input(shape=(28, 28)),  # declare the input shape as the first entry
  tf.keras.layers.Flatten(),       # 28x28 image -> vector of 784 values
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),    # randomly zero 20% of activations during training
  tf.keras.layers.Dense(10)        # one logit per digit class
])
model.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ flatten (Flatten)               │ (None, 784)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 128)            │       100,480 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 10)             │         1,290 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 101,770 (397.54 KB)
 Trainable params: 101,770 (397.54 KB)
 Non-trainable params: 0 (0.00 B)
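The parameter counts in the summary can be checked by hand. A dense layer has one weight per (input, unit) pair plus one bias per unit, while the Flatten and Dropout layers have no parameters. The helper name `dense_params` below is made up for this sketch, and the size estimate assumes 4 bytes per parameter (float32), which is consistent with the 397.54 KB reported above.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

flattened = 28 * 28                      # 784 pixels per image
hidden = dense_params(flattened, 128)    # first Dense layer
output = dense_params(128, 10)           # second Dense layer
total = hidden + output

print(hidden, output, total)             # 100480 1290 101770

# Assumption: each parameter is a 4-byte float32; size reported in units of 1024 bytes.
print(round(total * 4 / 1024, 2))        # 397.54
```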
# Returns "logits" or "log-odds" scores.
predictions = model(x_train[:1]).numpy()
print(predictions)

# convert to probabilities
print(tf.nn.softmax(predictions).numpy())
[[-0.13059187  0.27620077 -0.67851865  0.01817159  0.21402568  0.22630881
   0.14778733  0.29927585  0.00093798  0.60923946]]
[[0.07590088 0.11400257 0.04388188 0.08807527 0.1071303  0.10845432
  0.10026409 0.11666378 0.08657043 0.15905653]]
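To see what the softmax step does, the same conversion can be written out with the standard library. This is a minimal sketch, not how TensorFlow implements it: the function exponentiates each logit and divides by the sum, after subtracting the largest logit for numerical stability. The logits are copied from the output above.

```python
import math

def softmax(logits):
    """Exponentiate and normalize; subtracting the max avoids overflow."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Logits printed above for the first training image (untrained model).
logits = [-0.13059187, 0.27620077, -0.67851865, 0.01817159, 0.21402568,
          0.22630881, 0.14778733, 0.29927585, 0.00093798, 0.60923946]

probs = softmax(logits)
print([round(p, 4) for p in probs])
print(sum(probs))  # close to 1.0, up to floating-point rounding
```

Because the model is untrained, every class ends up with a probability near 0.1; the network has no real preference yet.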

26.2. Training the Model#

Define the loss function. This compares the true labels (the “truth values”) with the predicted logits and tells the training routine how far off the current model is.

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
print(loss_fn(y_train[:1], predictions).numpy())
2.2214262
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])
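The loss printed above can be reproduced by hand. For a single example, sparse categorical cross-entropy is the negative log of the probability assigned to the true class. The sketch below assumes the first training label is 5; the text does not show that label, but the value reproduces the printed loss.

```python
import math

# Probabilities printed earlier for the first training image (untrained model).
probs = [0.07590088, 0.11400257, 0.04388188, 0.08807527, 0.1071303,
         0.10845432, 0.10026409, 0.11666378, 0.08657043, 0.15905653]

# Assumption: the first training label is 5 (not shown in the text).
true_label = 5

# Cross-entropy for one example: -log(probability of the true class).
loss = -math.log(probs[true_label])
print(round(loss, 4))          # 2.2214

# For comparison, a uniform guess over 10 classes scores -log(1/10).
print(round(math.log(10), 4))  # 2.3026
```

The untrained model's loss sits close to the uniform-guess value, which is what one would expect before any training.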

Common machine learning metrics include accuracy, precision, and recall. For how each is calculated from a confusion matrix, see https://www.researchgate.net/figure/Calculation-of-Precision-Recall-and-Accuracy-in-the-confusion-matrix_fig3_336402347
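As a small illustration of how these metrics differ, the sketch below computes accuracy, precision, and recall from confusion-matrix counts for a two-class problem. The labels are hypothetical and chosen only to make the three numbers differ; for a ten-class problem such as MNIST, precision and recall would typically be computed per digit.

```python
# Hypothetical binary labels, for illustration only (1 = positive class).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1, 1, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

accuracy = (tp + tn) / len(pairs)   # fraction of all predictions that are right
precision = tp / (tp + fp)          # of predicted positives, how many are real
recall = tp / (tp + fn)             # of real positives, how many were found

print(tp, tn, fp, fn)                            # 4 3 2 1
print(accuracy, round(precision, 4), recall)     # 0.7 0.6667 0.8
```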

# Model.fit adjusts the model parameters and minimizes the loss function
model.fit(x_train, y_train, epochs=5)
Epoch 1/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 907us/step - accuracy: 0.8631 - loss: 0.4693
Epoch 2/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 864us/step - accuracy: 0.9553 - loss: 0.1525
Epoch 3/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 884us/step - accuracy: 0.9668 - loss: 0.1098
Epoch 4/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 999us/step - accuracy: 0.9727 - loss: 0.0876
Epoch 5/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 925us/step - accuracy: 0.9769 - loss: 0.0745

<keras.src.callbacks.history.History object at 0x14f1f15b0>

We have built a model/function that maps an input image to ten logits, one per digit. The predicted digit, an integer between 0 and 9, is the index of the largest logit.

model.evaluate(x_test,  y_test, verbose=2)
313/313 - 0s - 518us/step - accuracy: 0.9780 - loss: 0.0711

[0.07112332433462143, 0.9779999852180481]
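The step counts in the logs above (1875 during training, 313 during evaluation) are the number of batches per pass over the data. The arithmetic below rests on two assumptions not stated in the text: MNIST provides 60,000 training and 10,000 test images, and both `Model.fit` and `Model.evaluate` fall back to the Keras default batch size of 32 when none is given.

```python
import math

# Assumptions: standard MNIST split sizes and the Keras default batch size.
n_train, n_test = 60_000, 10_000
batch_size = 32

# The last batch may be partial, so round up.
train_steps = math.ceil(n_train / batch_size)
test_steps = math.ceil(n_test / batch_size)
print(train_steps, test_steps)  # 1875 313
```

The list returned by `model.evaluate` holds the test loss followed by the test accuracy, in the order the loss and metrics were given to `model.compile`.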

26.2.1. Getting Probabilities#

If you want the model to return probabilities instead of logits, wrap the trained model together with a softmax layer.

probability_model = tf.keras.Sequential([
  model,
  tf.keras.layers.Softmax()
])
probability_model.summary()  # summary() prints the table itself and returns None
print(probability_model(x_test[:5]))
Model: "sequential_1"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ sequential (Sequential)         │ (None, 10)             │       101,770 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ softmax (Softmax)               │ (None, 10)             │             0 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 101,770 (397.54 KB)
 Trainable params: 101,770 (397.54 KB)
 Non-trainable params: 0 (0.00 B)
tf.Tensor(
[[1.63633743e-07 6.37588258e-08 4.52343954e-07 4.97519613e-05
  1.47164676e-11 5.91744680e-08 5.77176111e-13 9.99947429e-01
  5.27417342e-07 1.61631215e-06]
 [2.84683299e-09 5.62214176e-04 9.99420822e-01 1.63003515e-05
  1.07201750e-15 1.14133982e-07 5.84734172e-09 1.64443668e-13
  4.99377563e-07 3.11501889e-14]
 [1.57981799e-06 9.98859227e-01 6.49404392e-05 2.18148307e-05
  1.12322905e-05 1.03776156e-05 7.43239725e-05 3.71358736e-04
  5.83040412e-04 1.98861403e-06]
 [9.99957979e-01 5.15877653e-11 1.53243400e-05 6.05400086e-09
  6.96891362e-08 3.93252435e-08 1.60159652e-05 2.50211883e-06
  3.10315329e-09 8.09647281e-06]
 [5.64167840e-06 9.11033415e-10 6.56053953e-06 4.30376623e-08
  9.95534420e-01 7.14883754e-07 4.97538565e-07 1.06153624e-04
  1.96117117e-06 4.34398372e-03]], shape=(5, 10), dtype=float32)
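To turn each row of probabilities into a predicted digit, take the index of the largest entry. The sketch below does this with the standard library on the values printed above; the `argmax` helper is written for this example. With the tensors themselves, `tf.argmax(probability_model(x_test[:5]), axis=1)` should give the same result.

```python
# Probabilities copied from the output above, one row per test image.
probabilities = [
    [1.63633743e-07, 6.37588258e-08, 4.52343954e-07, 4.97519613e-05,
     1.47164676e-11, 5.91744680e-08, 5.77176111e-13, 9.99947429e-01,
     5.27417342e-07, 1.61631215e-06],
    [2.84683299e-09, 5.62214176e-04, 9.99420822e-01, 1.63003515e-05,
     1.07201750e-15, 1.14133982e-07, 5.84734172e-09, 1.64443668e-13,
     4.99377563e-07, 3.11501889e-14],
    [1.57981799e-06, 9.98859227e-01, 6.49404392e-05, 2.18148307e-05,
     1.12322905e-05, 1.03776156e-05, 7.43239725e-05, 3.71358736e-04,
     5.83040412e-04, 1.98861403e-06],
    [9.99957979e-01, 5.15877653e-11, 1.53243400e-05, 6.05400086e-09,
     6.96891362e-08, 3.93252435e-08, 1.60159652e-05, 2.50211883e-06,
     3.10315329e-09, 8.09647281e-06],
    [5.64167840e-06, 9.11033415e-10, 6.56053953e-06, 4.30376623e-08,
     9.95534420e-01, 7.14883754e-07, 4.97538565e-07, 1.06153624e-04,
     1.96117117e-06, 4.34398372e-03],
]

def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=lambda i: values[i])

predicted_digits = [argmax(row) for row in probabilities]
print(predicted_digits)  # [7, 2, 1, 0, 4]
```

The first three predictions, 7, 2, and 1, agree with the first three labels of `y_test` shown near the top of the section.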

26.3. Further Reading#

%load_ext watermark
%watermark -untzvm -iv -w
Last updated: Fri May 02 2025 14:39:52CDT

Python implementation: CPython
Python version       : 3.12.10
IPython version      : 9.2.0

Compiler    : Clang 16.0.0 (clang-1600.0.26.6)
OS          : Darwin
Release     : 24.4.0
Machine     : arm64
Processor   : arm
CPU cores   : 12
Architecture: 64bit

rich      : 14.0.0
tensorflow: 2.19.0
keras     : 3.9.2
numpy     : 2.1.3
pandas    : 2.2.3
matplotlib: 3.10.1

Watermark: 2.5.0