26. Deep Learning#
26.1. TensorFlow#
We are going to look at a TensorFlow example of building a neural network to “read” handwritten digits. This is adapted from https://www.tensorflow.org/tutorials/quickstart/beginner.
An autoencoder is a network that learns a compressed representation (an encoding) of its input; the decoder half reconstructs the input from that representation. In this loose sense, neural nets are basically combining PCA-style dimensionality reduction with model fitting.
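As an aside, here is a minimal sketch of an autoencoder for the same kind of 28 × 28 images. The layer sizes are arbitrary choices for illustration, not part of the tutorial:

import tensorflow as tf

# Encoder: compress each 28x28 image into a 32-number "code" (akin to PCA scores)
encoder = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation='relu'),
])
# Decoder: reconstruct the image from the code
decoder = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32,)),
    tf.keras.layers.Dense(28 * 28, activation='sigmoid'),
    tf.keras.layers.Reshape((28, 28)),
])
autoencoder = tf.keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer='adam', loss='mse')
# Training would use the input as its own target: autoencoder.fit(x_train, x_train)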
import tensorflow as tf
print("TensorFlow version:", tf.__version__)
TensorFlow version: 2.19.0
# Load MNIST: images of 28x28 handwritten digits with integer labels 0-9
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Scale the pixel values from [0, 255] down to [0, 1]
x_train, x_test = x_train / 255.0, x_test / 255.0
# The labels are the digits themselves
y_test
array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)
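A quick look at the array shapes confirms the standard MNIST split of 60,000 training images and 10,000 test images:

print(x_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)
print(x_test.shape, y_test.shape)    # (10000, 28, 28) (10000,)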
What is a sequential model? It is a plain stack of layers, each with one input and one output, applied in order. The docs for tf.keras.models.Sequential can be found at https://www.tensorflow.org/api_docs/python/tf/keras/Sequential.
model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),          # 28x28 grayscale image
    tf.keras.layers.Flatten(),                      # -> vector of 784 pixels
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),                   # zero out 20% of units during training
    tf.keras.layers.Dense(10)                       # one logit per digit class
])
model.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ flatten (Flatten)               │ (None, 784)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 128)            │       100,480 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 10)             │         1,290 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 101,770 (397.54 KB)
Trainable params: 101,770 (397.54 KB)
Non-trainable params: 0 (0.00 B)
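Where do the parameter counts come from? A Dense layer with n inputs and m units has n × m weights plus m biases. So the first Dense layer has 784 × 128 + 128 = 100,480 parameters, the output layer has 128 × 10 + 10 = 1,290, and the Flatten and Dropout layers have none, for 101,770 in total.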
# Returns "logits" or "log-odds" scores.
predictions = model(x_train[:1]).numpy()
print(predictions)
# convert to probabilities
print(tf.nn.softmax(predictions).numpy())
[[-0.13059187 0.27620077 -0.67851865 0.01817159 0.21402568 0.22630881
0.14778733 0.29927585 0.00093798 0.60923946]]
[[0.07590088 0.11400257 0.04388188 0.08807527 0.1071303 0.10845432
0.10026409 0.11666378 0.08657043 0.15905653]]
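Softmax exponentiates each logit and normalizes so the scores sum to one. As a sanity check, a hand-rolled version in plain NumPy (with made-up logits) does the same thing as tf.nn.softmax:

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # made-up logits for illustration
print(softmax(logits))              # [0.659 0.242 0.099], sums to 1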
26.2. Training the Model#
Define the loss function. This compares the “truth values” with the predicted logits and lets the training routine know how accurate the current model is.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
print(loss_fn(y_train[:1], predictions).numpy())
2.2214262
The untrained model assigns roughly uniform probability (about 1/10) to each of the 10 classes, so the initial loss should be near −log(1/10) ≈ 2.3, which matches.
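Under the hood, SparseCategoricalCrossentropy(from_logits=True) applies a softmax to the logits and then takes the negative log of the probability assigned to the true class. A minimal NumPy sketch (not the actual Keras implementation) reproduces the number above:

import numpy as np

def sparse_cce_from_logits(y_true, logits):
    # Softmax over the class axis
    z = logits - logits.max(axis=-1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    # Negative log-probability of the true class
    return -np.log(p[np.arange(len(y_true)), y_true])

print(sparse_cce_from_logits(y_train[:1], predictions))  # ~[2.2214], as above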
model.compile(optimizer='adam',
loss=loss_fn,
metrics=['accuracy'])
There are other machine learning metrics besides accuracy, such as precision and recall, all computed from the confusion matrix; see https://www.researchgate.net/figure/Calculation-of-Precision-Recall-and-Accuracy-in-the-confusion-matrix_fig3_336402347 for a visual summary, and the sketch below for the arithmetic.
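Here is that arithmetic for a binary classifier; the labels and predictions are made up purely for illustration:

import numpy as np

# Made-up binary labels and predictions, purely for illustration
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
tn = np.sum((y_pred == 0) & (y_true == 0))  # true negatives

accuracy = (tp + tn) / len(y_true)  # fraction of all predictions that are correct
precision = tp / (tp + fp)          # of the predicted positives, how many are right
recall = tp / (tp + fn)             # of the actual positives, how many are found
print(accuracy, precision, recall)  # 0.75 0.75 0.75 for this toy example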
# Model.fit adjusts the model parameters to minimize the loss function
model.fit(x_train, y_train, epochs=5)
Epoch 1/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 907us/step - accuracy: 0.8631 - loss: 0.4693
Epoch 2/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 864us/step - accuracy: 0.9553 - loss: 0.1525
Epoch 3/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 884us/step - accuracy: 0.9668 - loss: 0.1098
Epoch 4/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 999us/step - accuracy: 0.9727 - loss: 0.0876
Epoch 5/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 925us/step - accuracy: 0.9769 - loss: 0.0745
<keras.src.callbacks.history.History object at 0x14f1f15b0>
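fit returns a History object whose .history dictionary records the per-epoch metrics. If we capture it, we can plot the training curves; a quick sketch (note that calling fit again continues training from the current weights):

import matplotlib.pyplot as plt

history = model.fit(x_train, y_train, epochs=5)
plt.plot(history.history['loss'], label='loss')
plt.plot(history.history['accuracy'], label='accuracy')
plt.xlabel('epoch')
plt.legend()
plt.show()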
We have built a model, a function that goes from an image input to a predicted integer between 0 and 9. We can check its performance on the held-out test set; evaluate returns the loss followed by each metric passed to compile:
model.evaluate(x_test, y_test, verbose=2)
313/313 - 0s - 518us/step - accuracy: 0.9780 - loss: 0.0711
[0.07112332433462143, 0.9779999852180481]
26.2.1. Getting Probabilities#
If you want the model to return probabilities instead of logits, just wrap it in a Softmax layer.
probability_model = tf.keras.Sequential([
model,
tf.keras.layers.Softmax()
])
print(probability_model.summary())
print(probability_model(x_test[:5]))
Model: "sequential_1"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ sequential (Sequential)         │ (None, 10)             │       101,770 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ softmax (Softmax)               │ (None, 10)             │             0 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 101,770 (397.54 KB)
Trainable params: 101,770 (397.54 KB)
Non-trainable params: 0 (0.00 B)
None
tf.Tensor(
[[1.63633743e-07 6.37588258e-08 4.52343954e-07 4.97519613e-05
1.47164676e-11 5.91744680e-08 5.77176111e-13 9.99947429e-01
5.27417342e-07 1.61631215e-06]
[2.84683299e-09 5.62214176e-04 9.99420822e-01 1.63003515e-05
1.07201750e-15 1.14133982e-07 5.84734172e-09 1.64443668e-13
4.99377563e-07 3.11501889e-14]
[1.57981799e-06 9.98859227e-01 6.49404392e-05 2.18148307e-05
1.12322905e-05 1.03776156e-05 7.43239725e-05 3.71358736e-04
5.83040412e-04 1.98861403e-06]
[9.99957979e-01 5.15877653e-11 1.53243400e-05 6.05400086e-09
6.96891362e-08 3.93252435e-08 1.60159652e-05 2.50211883e-06
3.10315329e-09 8.09647281e-06]
[5.64167840e-06 9.11033415e-10 6.56053953e-06 4.30376623e-08
9.95534420e-01 7.14883754e-07 4.97538565e-07 1.06153624e-04
1.96117117e-06 4.34398372e-03]], shape=(5, 10), dtype=float32)
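Each row above is a probability distribution over the ten digits, and the predicted digit is the index of the largest entry. For example:

import numpy as np

probs = probability_model(x_test[:5]).numpy()
print(np.argmax(probs, axis=1))  # [7 2 1 0 4], the argmax of each row above
print(y_test[:5])                # the true labels, for comparison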
26.3. Further Reading#
Beyond TensorFlow, another common tool is PyTorch.
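For a rough comparison, here is a sketch of the same architecture in PyTorch (assuming torch is installed; it is not used elsewhere in this chapter):

import torch
from torch import nn

# The same architecture as the Keras model above
model_pt = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(128, 10),
)
loss_fn_pt = nn.CrossEntropyLoss()  # expects logits, like from_logits=True above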
%load_ext watermark
%watermark -untzvm -iv -w
Last updated: Fri May 02 2025 14:39:52CDT
Python implementation: CPython
Python version : 3.12.10
IPython version : 9.2.0
Compiler : Clang 16.0.0 (clang-1600.0.26.6)
OS : Darwin
Release : 24.4.0
Machine : arm64
Processor : arm
CPU cores : 12
Architecture: 64bit
rich : 14.0.0
tensorflow: 2.19.0
keras : 3.9.2
numpy : 2.1.3
pandas : 2.2.3
matplotlib: 3.10.1
Watermark: 2.5.0